Commit Graph

114441 Commits

Author SHA1 Message Date
Marc Zyngier
5d4c9bc776 irqdomain: Use irq_domain_get_of_node() instead of direct field access
The struct irq_domain contains a "struct device_node *" field
(of_node) that is almost the only link between the irqdomain
and the device tree infrastructure.

In order to prepare for the removal of that field, convert all
users to use irq_domain_get_of_node() instead.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-and-tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Tomasz Nowicki <tomasz.nowicki@linaro.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Graeme Gregory <graeme@xora.org.uk>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Link: http://lkml.kernel.org/r/1444737105-31573-2-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-13 19:01:23 +02:00
Thomas Gleixner
e50226b4b8 Merge branch 'linus' into irq/core
Bring in upstream updates for patches which depend on them
2015-10-13 19:00:14 +02:00
Paolo Bonzini
7391773933 KVM: x86: fix SMI to halted VCPU
An SMI to a halted VCPU must wake it up, hence a VCPU with a pending
SMI must be considered runnable.

Fixes: 64d6067057
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:29:41 +02:00
Paolo Bonzini
5d9bc648b9 KVM: x86: clean up kvm_arch_vcpu_runnable
Split the huge conditional in two functions.

Fixes: 64d6067057
Cc: stable@vger.kernel.org
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:28:59 +02:00
Paolo Bonzini
f0d648bdf0 KVM: x86: map/unmap private slots in __x86_set_memory_region
Otherwise, two copies (one of them never populated and thus bogus)
are allocated for the regular and SMM address spaces.  This breaks
SMM with EPT but without unrestricted guest support, because the
SMM copy of the identity page map is all zeros.

By moving the allocation to the caller we also remove the last
vestiges of kernel-allocated memory regions (not accessible anymore
in userspace since commit b74a07beed, "KVM: Remove kernel-allocated
memory regions", 2010-06-21); that is a nice bonus.

Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Cc: stable@vger.kernel.org
Fixes: 9da0e4d5ac
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:28:58 +02:00
Paolo Bonzini
1d8007bdee KVM: x86: build kvm_userspace_memory_region in x86_set_memory_region
The next patch will make x86_set_memory_region fill the
userspace_addr.  Since the struct is not used untouched
anymore, it makes sense to build it in x86_set_memory_region
directly; it also simplifies the callers.

Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
Cc: stable@vger.kernel.org
Fixes: 9da0e4d5ac
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-13 18:28:46 +02:00
Tomi Valkeinen
7e381ec6a3 ARM: dts: am57xx-beagle-x15: set VDD_SD to always-on
LDO1 regulator (VDD_SD) is connected to SoC's vddshv8. vddshv8 needs to
be kept always powered (see commit 5a0f93c657 ("ARM: dts: Add
am57xx-beagle-x15"), but at the moment VDD_SD is enabled/disabled
depending on whether an SD card is inserted or not.

This patch sets LDO1 regulator to always-on.

This patch has a side effect of fixing another issue, HDMI DDC not
working when SD card is not inserted:

Why this happens is that the tpd12s015 (HDMI level shifter/ESD
protection chip) has LS_OE GPIO input, which needs to be enabled for the
HDMI DDC to work. LS_OE comes from gpio6_28. The pin that provides
gpio6_28 is powered by vddshv8, and vddshv8 comes from VDD_SD.

So when SD card is not inserted, VDD_SD is disabled, and LS_OE stays
off.

The proper fix for the HDMI DDC issue would be to maybe have the pinctrl
framework manage the pin specific power.

Apparently this fixes also a third issue (copy paste from Kishon's
patch):

ldo1_reg in addition to being connected to the io lines is also
connected to the card detect line. On card removal, omap_hsmmc
driver does a regulator_disable causing card detect line to be
pulled down. This raises a card insertion interrupt and once the
MMC core detects there is no card inserted, it does a
regulator disable which again raises a card insertion interrupt.
This happens in a loop causing infinite MMC interrupts.

Fixes: 5a0f93c657 ("ARM: dts: Add am57xx-beagle-x15")
Cc: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reported-by: Louis McCarthy <compeoree@gmail.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-10-12 16:16:34 -07:00
Alim Akhtar
b8bb9baad2 ARM: dts: Fix audio card detection on Peach boards
Since commit 2fad972d45 ("ARM: dts: Add mclk entry for Peach boards"),
sound card detection is broken on peach boards and gives below errors:

[    3.630457] max98090 7-0010: MAX98091 REVID=0x51
[    3.634233] max98090 7-0010: use default 2.8v micbias
[    3.640985] snow-audio sound: HiFi <-> 3830000.i2s mapping ok
[    3.645307] max98090 7-0010: Invalid master clock frequency
[    3.650824] snow-audio sound: ASoC: Peach-Pi-I2S-MAX98091 late_probe() failed: -22
[    3.658914] snow-audio sound: snd_soc_register_card failed (-22)
[    3.664366] snow-audio: probe of sound failed with error -22

This patch adds missing assigned-clocks and assigned-clock-parents for
pmu_system_controller node which is used as "mclk" for audio codec.

Fixes: 2fad972d45 ("ARM: dts: Add mclk entry for Peach boards")
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-10-13 04:40:27 +09:00
Krzysztof Kozlowski
51a6256b00 ARM: EXYNOS: Fix double of_node_put() when parsing child power domains
On each next iteration of for_each_compatible_node() the reference
counter for current device node is already decreased by the loop
iterator. The manual call to of_node_get() is required only on loop
break which is not happening here.

The double of_node_get() (with enabled CONFIG_OF_DYNAMIC) lead to
decreasing the counter below expected, initial value.

Fixes: fe4034a3fa ("ARM: EXYNOS: Add missing of_node_put() when parsing power domains")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-10-13 04:37:17 +09:00
Manjeet Pawar
c9692657c0 arm64: Fix MINSIGSTKSZ and SIGSTKSZ
MINSIGSTKSZ and SIGSTKSZ for ARM64 are not correctly set in latest kernel.
This patch fixes this issue.

This issue is reported in LTP (testcase: sigaltstack02.c).
Testcase failed when sigaltstack() called with stack size "MINSIGSTKSZ - 1"
Since in Glibc-2.22, MINSIGSTKSZ is set to 5120 but in kernel
it is set to 2048 so testcase gets failed.

Testcase Output:
sigaltstack02 1  TPASS  :  stgaltstack() fails, Invalid Flag value,errno:22
sigaltstack02 2  TFAIL  :  sigaltstack() returned 0, expected -1,errno:12

Reported Issue in Glibc Bugzilla:
Bugfix in Glibc-2.22: [Bug 16850]
https://sourceware.org/bugzilla/show_bug.cgi?id=16850

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Akhilesh Kumar <akhilesh.k@samsung.com>
Signed-off-by: Manjeet Pawar <manjeet.p@samsung.com>
Signed-off-by: Rohit Thapliyal <r.thapliyal@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-12 17:40:12 +01:00
Will Deacon
b6dd8e0719 arm64: errata: use KBUILD_CFLAGS_MODULE for erratum #843419
Commit df057cc7b4 ("arm64: errata: add module build workaround for
erratum #843419") sets CFLAGS_MODULE to ensure that the large memory
model is used by the compiler when building kernel modules.

However, CFLAGS_MODULE is an environment variable and intended to be
overridden on the command line, which appears to be the case with the
Ubuntu kernel packaging system, so use KBUILD_CFLAGS_MODULE instead.

Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: df057cc7b4 ("arm64: errata: add module build workaround for erratum #843419")
Reported-by: Dann Frazier <dann.frazier@canonical.com>
Tested-by: Dann Frazier <dann.frazier@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-12 17:40:12 +01:00
Borislav Petkov
0399f73299 x86/microcode/amd: Do not overwrite final patch levels
A certain number of patch levels of applied microcode should not
be overwritten by the microcode loader, otherwise bad things
will happen.

Check those and abort update if the current core has one of
those final patch levels applied by the BIOS. 32-bit needs
special handling, of course.

See https://bugzilla.suse.com/show_bug.cgi?id=913996 for more
info.

Tested-by: Peter Kirchgeßner <pkirchgessner@t-online.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1444641762-9437-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 16:15:48 +02:00
Borislav Petkov
2eff73c0a1 x86/microcode/amd: Extract current patch level read to a function
Pave the way for checking the current patch level of the
microcode in a core. We want to be able to do stuff depending on
the patch level - in this case decide whether to update or not.
But that will be added in a later patch.

Drop unused local var uci assignment, while at it.

Integrate a fix for 32-bit and CONFIG_PARAVIRT from Takashi Iwai:

 Use native_rdmsr() in check_current_patch_level() because with
 CONFIG_PARAVIRT enabled and on 32-bit, where we run before
 paging has been enabled, we cannot deref pv_info yet. Or we
 could, but we'd need to access its physical address. This way of
 fixing it is simpler. See:

   https://bugzilla.suse.com/show_bug.cgi?id=943179 for the background.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Takashi Iwai <tiwai@suse.com>:
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1444641762-9437-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 16:15:48 +02:00
Aravind Gopalakrishnan
fa20a2ed6f x86/ras/mce_amd_inj: Inject bank 4 errors on the NBC
Bank 4 MCEs are logged and reported only on the node base core
(NBC) in a socket. Refer to the D18F3x44[NbMcaToMstCpuEn] field
in Fam10h and later BKDGs. The node base core (NBC) is the
lowest numbered core in the node.

This patch ensures that we inject the error on the NBC for bank
4 errors. Otherwise, triggering #MC or APIC interrupts on a core
which is not the NBC would not have any effect on the system,
i.e. we would not see any relevant output on kernel logs for the
error we just injected.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Cleanup comments. ]
[ Add a missing dependency on AMD_NB caught by Randy Dunlap. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1443190851-2172-4-git-send-email-Aravind.Gopalakrishnan@amd.com
Link: http://lkml.kernel.org/r/1444641762-9437-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 16:15:48 +02:00
Aravind Gopalakrishnan
a1300e5052 x86/ras/mce_amd_inj: Trigger deferred and thresholding errors interrupts
Add the capability to trigger deferred error interrupts and
threshold interrupts in order to test the APIC interrupt handler
functionality for these type of errors.

Update README section about the same too.

Reported by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Cleanup comments. ]
[ Include asm/irq_vectors.h directly so that misc randbuilds don't fail. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1443190851-2172-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Link: http://lkml.kernel.org/r/1444641762-9437-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 16:15:47 +02:00
Aravind Gopalakrishnan
85c9306d44 x86/ras/mce_amd_inj: Return early on invalid input
Invalid inputs such as these are currently reported in dmesg as
failing:

  $> echo sweet > flags
  [  122.079139] flags_write: Invalid flags value: et

even though the 'flags' attribute has been updated correctly:

  $> cat flags
  sw

This is because userspace keeps writing the remaining buffer
until it encounters an error.

However, the input as a whole is wrong and we should not be
writing anything to the file. Therefore, correct flags_write()
to return -EINVAL immediately on bad input strings.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1443190851-2172-2-git-send-email-Aravind.Gopalakrishnan@amd.com
Link: http://lkml.kernel.org/r/1444641762-9437-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 16:15:47 +02:00
Taku Izumi
0f96a99dab efi: Add "efi_fake_mem" boot option
This patch introduces new boot option named "efi_fake_mem".
By specifying this parameter, you can add arbitrary attribute
to specific memory range.
This is useful for debugging of Address Range Mirroring feature.

For example, if "efi_fake_mem=2G@4G:0x10000,2G@0x10a0000000:0x10000"
is specified, the original (firmware provided) EFI memmap will be
updated so that the specified memory regions have
EFI_MEMORY_MORE_RELIABLE attribute (0x10000):

 <original>
   efi: mem36: [Conventional Memory|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000100000000-0x00000020a0000000) (129536MB)

 <updated>
   efi: mem36: [Conventional Memory|  |MR|  |  |  |   |WB|WT|WC|UC] range=[0x0000000100000000-0x0000000180000000) (2048MB)
   efi: mem37: [Conventional Memory|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000000180000000-0x00000010a0000000) (61952MB)
   efi: mem38: [Conventional Memory|  |MR|  |  |  |   |WB|WT|WC|UC] range=[0x00000010a0000000-0x0000001120000000) (2048MB)
   efi: mem39: [Conventional Memory|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000001120000000-0x00000020a0000000) (63488MB)

And you will find that the following message is output:

   efi: Memory: 4096M/131455M mirrored memory

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-10-12 14:20:09 +01:00
Taku Izumi
0bbea1ce98 x86/efi: Rename print_efi_memmap() to efi_print_memmap()
This patch renames print_efi_memmap() to efi_print_memmap() and
make it global function so that we can invoke it outside of
arch/x86/platform/efi/efi.c

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-10-12 14:20:08 +01:00
Matt Fleming
ae2ee627dc efifb: Add support for 64-bit frame buffer addresses
The EFI Graphics Output Protocol uses 64-bit frame buffer addresses
but these get truncated to 32-bit by the EFI boot stub when storing
the address in the 'lfb_base' field of 'struct screen_info'.

Add a 'ext_lfb_base' field for the upper 32-bits of the frame buffer
address and set VIDEO_TYPE_CAPABILITY_64BIT_BASE when the field is
useable.

It turns out that the reason no one has required this support so far
is that there's actually code in tianocore to "downgrade" PCI
resources that have option ROMs and 64-bit BARS from 64-bit to 32-bit
to cope with legacy option ROMs that can't handle 64-bit addresses.
The upshot is that basically all GOP devices in the wild use a 32-bit
frame buffer address.

Still, it is possible to build firmware that uses a full 64-bit GOP
frame buffer address. Chad did, which led to him reporting this issue.

Add support in anticipation of GOP devices using 64-bit addresses more
widely, and so that efifb works out of the box when that happens.

Reported-by: Chad Page <chad.page@znyx.com>
Cc: Pete Hawkins <pete.hawkins@znyx.com>
Acked-by: Peter Jones <pjones@redhat.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-10-12 14:20:06 +01:00
Leif Lindholm
7968c0e338 efi/arm64: Clean up efi_get_fdt_params() interface
As we now have a common debug infrastructure between core and arm64 efi,
drop the bit of the interface passing verbose output flags around.

Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-10-12 14:20:06 +01:00
Leif Lindholm
c9494dc818 arm64: Use core efi=debug instead of uefi_debug command line parameter
Now that we have an efi=debug command line option in the core code, use
this instead of the arm64-specific uefi_debug option.

Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-10-12 14:20:05 +01:00
Leif Lindholm
12dd00e83f efi/x86: Move efi=debug option parsing to core
fed6cefe3b ("x86/efi: Add a "debug" option to the efi= cmdline")
adds the DBG flag, but does so for x86 only. Move this early param
parsing to core code.

Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2015-10-12 14:20:05 +01:00
Ingo Molnar
cdbcd239e2 Merge branch 'x86/ras' into ras/core, to pick up changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 14:52:34 +02:00
Minfei Huang
e9c40d257f x86/kexec: Remove obsolete 'in_crash_kexec' flag
Previously, UV NMI used the 'in_crash_kexec' flag to determine whether
we are in a kdump kernel or not:

  5edd19af18 ("x86, UV: Make kdump avoid stack dumps")

But this flags was removed in the following commit:

  9c48f1c629 ("x86, nmi: Wire up NMI handlers to new routines")

Since it isn't used any more, remove it.

Signed-off-by: Minfei Huang <mnfhuang@gmail.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: cpw@sgi.com
Cc: kexec@lists.infradead.org
Cc: mhuang@redhat.com
Link: http://lkml.kernel.org/r/1444070155-17934-1-git-send-email-mhuang@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-12 09:43:11 +02:00
Gabriel Laskar
7e0abcd6b7 x86/mce: Include linux/ioctl.h in uapi mce header
asm/ioctls.h contains definition for termios, not just the _IO* macros.

This error was found with a tool in development used to generate
automated pretty-printing functions for ioctl decoding in strace.

Signed-off-by: Gabriel Laskar <gabriel@lse.epita.fr>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1444141657-14898-2-git-send-email-gabriel@lse.epita.fr
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-11 21:24:27 +02:00
Andy Shevchenko
4faefda97b x86/io_apic: Make eoi_ioapic_pin() static
We have to define internally used function as static, otherwise the following
warning will be generated:

arch/x86/kernel/apic/io_apic.c:532:6: warning: no previous prototype for 'eoi_ioapic_pin' [-Wmissing-prototypes]

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1444400685-98611-1-git-send-email-andriy.shevchenko@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-11 21:10:31 +02:00
Andy Shevchenko
d1f0f6c72c x86/intel-mid: Make intel_mid_ops static
The following warning is issued on unfixed code.

arch/x86/platform/intel-mid/intel-mid.c:64:22: warning: symbol 'intel_mid_ops' was not declared. Should it be static?

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: http://lkml.kernel.org/r/1444400741-98669-1-git-send-email-andriy.shevchenko@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-11 21:03:12 +02:00
Borislav Petkov
374a3a3916 x86/entry/64/compat: Document sysenter_fix_flags's reason for existence
The code under the label can normally be inline, without the
jumping back and forth but the latter is an optimization.

Document that.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20151009170859.GA24266@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-11 11:06:40 +02:00
Linus Torvalds
bbecce8d76 Merge git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:

 - MIPS didn't define the new ioremap_uc.  Defined it as an alias for
   ioremap_uncached.

 - Replace workaround for MIPS16 build issue with a correct one.

* git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Define ioremap_uc
  MIPS: UAPI: Ignore __arch_swab{16,32,64} when using MIPS16
  Revert "MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16."
2015-10-10 10:51:55 -07:00
Linus Torvalds
1d8a12d1de Merge branch 'stable/for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fixlet from Konrad Rzeszutek Wilk:
 "Enable the SWIOTLB under 32-bit PAE kernels.

  Nowadays most distros enable this due to CONFIG_HYPERVISOR|XEN=y which
  select SWIOTLB.  But for those that are not interested in
  virtualization and wanting to use 32-bit PAE kernels and wanting to
  have working DMA operations - this configures it for them"

* 'stable/for-linus-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: Enable it under x86 PAE
2015-10-10 10:31:13 -07:00
Linus Torvalds
71419b7b84 Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull strscpy powerpc fix from Chris Metcalf.

Fix powerpc big-endian build.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch/powerpc: provide zero_bytemask() for big-endian
2015-10-09 18:01:26 -07:00
Linus Torvalds
5163ac7637 ARM: SoC fixes for 4.3-rc4
The fixes for this week include one small patch that was years in the
 making and that finally fixes using all eight CPUs on exynos542x.
 
 The rest are lots of minor changes for sunxi, imx, exynos and shmobile
 
 * fixing the minimum voltage for Allwinner A20
 * thermal boot issue on SMDK5250.
 * invalid clock used for FIMD IOMMU.
 * audio on Renesas r8a7790/r8a7791
 * invalid clock used for FIMD IOMMU
 * LEDs on exynos5422-odroidxu3-common
 * usb pin control for imx-rex
 * imx53: fix PMIC interrupt level
 * a Makefile typo
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVhbS1GCrR//JCVInAQLwiQ/+LeAAqODQKpLI9BhKSSa1YiPgs8zr/O+q
 6SpyNf969j2d67hhDbBNIrcNw7bdYElogq+2emrPnsJ/o/GX18VFs+41s+zb2P1r
 dxmH9LA7wrUg05nc/TgyYXJZQa+JIZBymYJ6Kc9cdbkhmRZazAcV6POT4ZG5qfER
 QDwPGwBn/wXLMZ0yJnocUVexTn+GUdy0b7XRg141PYtYHg+mA0EEPHqul1IyB/rV
 W5u9HoA86mWLH+8CEzl7RTCXEPga/+ScxqimDFMW7Ok6F+CkPnD7u5z92p8dU38T
 J0Dc/xSA9w+8Y4AQuN1qM7g5W/qNszozaBusshIMF+UK5dDEEwWpdpvRr4mLpqLS
 hohu7zUel3V5n846Rwkr181Hh9yn5V7MiJ0vjj5gYmYeteLs5Gar94I/vnd9BMrD
 7lJo0aTMcoQNIvf2i1SEfyhQW/YOdWiU452sxtzNFe/wJ/6hdQxx/qgBMA1Dxm7/
 s1+bQ3ndBa5qiiTcVg5XBAGnxe5Eo7lqHStDyJ6hy3v8nt5ew1iPbBt8XEwHonDC
 8whzRKQMI70hz5nQoMLjEwiGhT3yFQu2IyrFD2yPldq2i4VC2iZybWianKa5BXlu
 16Easzhk05uZu340+tqCxxGwTaVSjNcJ+HRHRvW4cw6sReCeUPOtlnzlOGRufZpO
 pi2gCB3aTnY=
 =IsHY
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "The fixes for this week include one small patch that was years in the
  making and that finally fixes using all eight CPUs on exynos542x.

  The rest are lots of minor changes for sunxi, imx, exynos and shmobile

   - fixing the minimum voltage for Allwinner A20
   - thermal boot issue on SMDK5250.
   - invalid clock used for FIMD IOMMU.
   - audio on Renesas r8a7790/r8a7791
   - invalid clock used for FIMD IOMMU
   - LEDs on exynos5422-odroidxu3-common
   - usb pin control for imx-rex
   - imx53: fix PMIC interrupt level
   - a Makefile typo"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: dts: Fix wrong clock binding for sysmmu_fimd1_1 on exynos5420
  ARM: dts: Fix bootup thermal issue on smdk5250
  ARM: shmobile: r8a7791 dtsi: Add CPG/MSTP Clock Domain for sound
  ARM: shmobile: r8a7790 dtsi: Add CPG/MSTP Clock Domain for sound
  arm-cci500: Don't enable PMU driver by default
  ARM: dts: fix usb pin control for imx-rex dts
  ARM: imx53: qsrb: fix PMIC interrupt level
  ARM: imx53: include IRQ dt-bindings header
  ARM: dts: add suspend opp to exynos4412
  ARM: dts: Fix LEDs on exynos5422-odroidxu3
  ARM: EXYNOS: reset Little cores when cpu is up
  ARM: dts: Fix Makefile target for sun4i-a10-itead-iteaduino-plus
  ARM: dts: sunxi: Raise minimum CPU voltage for sun7i-a20 to meet SoC specifications
2015-10-09 15:54:14 -07:00
Jean-Philippe Brucker
4f64cb65bf arm/arm64: KVM: Only allow 64bit hosts to build VGICv3
Hardware virtualisation of GICv3 is only supported by 64bit hosts for
the moment. Some VGICv3 bits are missing from the 32bit side, and this
patch allows to still be able to build 32bit hosts when CONFIG_ARM_GIC_V3
is selected.

To this end, we introduce a new option, CONFIG_KVM_ARM_VGIC_V3, that is
only enabled on the 64bit side. The selection is done unconditionally
because CONFIG_ARM_GIC_V3 is always enabled on arm64.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 23:11:57 +01:00
Jean-Philippe Brucker
0b28f1db3d ARM: virt: select ARM_GIC_V3
This patch allows ARM guests to use GICv3 on an arm64 host

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 23:11:56 +01:00
Jean-Philippe Brucker
d5cd50d318 ARM: add 32bit support to GICv3
Implement the system and memory-mapped register accesses in
asm/arch_gicv3.h for 32bit architectures.

This patch is a straightforward translation of the arm64 header. 64bit
accesses are done in two times and don't need atomicity: TYPER is
read-only, and the upper-word of IROUTER is always zero on 32bit
architectures.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 23:11:55 +01:00
Jean-Philippe Brucker
72c971262f irqchip/gic-v3: Specialize readq and writeq accesses
On 32bit platforms, we cannot assure that an I/O ldrd or strd will be
done atomically. Besides, an hypervisor would be unable to emulate such
accesses.
In order to allow the AArch32 version of the driver to split them into
two 32bit accesses while keeping the requirement for atomic writes, this
patch specializes the IROUTER and TYPER accesses.
Since the latter is an ID register, it won't need to be read atomically,
but we still avoid future confusion by using gic_read_typer instead of a
generic gic_readq.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 23:11:53 +01:00
Jean-Philippe Brucker
f6c86a41e1 irqchip/gic-v3: Change unsigned types for AArch32 compatibility
This patch does a few simple compatibility-related changes:
- change the system register access prototypes to their actual size,
- homogenise mpidr accesses with unsigned long,
- force the 64bit register values to unsigned long long.

Note: the list registers are 64bit on GICv3, but the AArch32 vGIC driver
will need to split their values into two 32bit registers: LRn and LRCn.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 23:11:52 +01:00
Jean-Philippe Brucker
7936e914f7 irqchip/gic-v3: Refactor the arm64 specific parts
This patch moves the GICv3 system register access helpers to
arch/arm64/. Their 32bit counterparts will need to use mrc/mcr accesses
instead of mrs_s/msr_s.

[maz: fixed conflict with Cavium erratum handling]
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 23:11:50 +01:00
Marc Zyngier
963fcd4095 arm64: cpufeatures: Check ICC_EL1_SRE.SRE before enabling ARM64_HAS_SYSREG_GIC_CPUIF
As the firmware (or the hypervisor) may have disabled SRE access,
check that SRE can actually be enabled before declaring that we
do have that capability.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 22:16:54 +01:00
Marc Zyngier
d271976dbb arm64: el2_setup: Make sure ICC_SRE_EL2.SRE sticks before using GICv3 sysregs
Contrary to what was originally expected, EL3 firmware can (for whatever
reason) disable GICv3 system register access. In this case, the kernel
explodes very early.

Work around this by testing if the SRE bit sticks or not. If it doesn't,
abort the GICv3 setup, and pray that the firmware has passed a DT that
doesn't contain a GICv3 node.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-10-09 22:16:51 +01:00
Sarbojit Ganguly
e8973a889e ARM: 8443/1: Adding support for atomic half word exchange
Since support for half-word atomic exchange was not there and Qspinlock
on ARM requires it, modified __xchg() to add support for that as well.
ARMv6 and lower does not support ldrex{b,h} so, added a guard code to
prevent build breaks.

Signed-off-by: Sarbojit Ganguly <ganguly.s@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-09 16:22:54 +01:00
Russell King
e1b8c05dcc ARM: clean up TWD after previous patch
Rename feat_c3stop to twd_features to match the other variables in this
file.  Initialise it with the standard features that we always support,
and arrange to set the CLOCK_EVT_FEAT_C3STOP when appropriate.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-09 16:22:53 +01:00
Marc Gonzalez
194444c52e ARM: 8441/2: twd: Don't set CLOCK_EVT_FEAT_C3STOP unconditionally
In 5388a6b266 ("ARM: SMP: Always enable clock event broadcast support")
Russell noted that "the TWD local timers are unable to wake up the CPU
when it is placed into a low power mode".

However, some platforms do not stop the TWD block in low-power mode,
and can thus use the TWD timer in one-shot mode, without setting up
a broadcast device.

Make the driver check for the "always-on" boolean property, and set
the CLOCK_EVT_FEAT_C3STOP flag accordingly.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-09 16:22:53 +01:00
Vineet Gupta
24830fc782 ARC: mm: Introduce PTE_SPECIAL
Needed for THP, but will also come in handy for fast GUP later

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:23 +05:30
Vineet Gupta
129cbed54a ARC: mm: pte flags comsetic cleanups, comments
No semantical changes

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Vineet Gupta
e8a75963a4 ARC: mm: switch pgtable_to to pte_t *
ARC is the only arch with unsigned long type (vs. struct page *).
Historically this was done to avoid the page_address() calls in various
arch hooks which need to get the virtual/logical address of the table.

Some arches alternately define it as pte_t *, and is as efficient as
unsigned long (generated code doesn't change)

Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Andy Lutomirski
f5e6a9753a x86/entry: Split and inline syscall_return_slowpath()
GCC is unable to properly optimize functions that have a very
short likely case and a longer and register-heavier cold part --
it fails to sink all of the register saving and stack frame
setup code into the unlikely part.

Help it out with syscall_return_slowpath() by splitting it into
two parts and inline the hot part.

Saves 6 cycles for compat syscalls.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/0f773a894ab15c589ac794c2d34ca6ba9b5335c9.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:13 +02:00
Andy Lutomirski
39b48e575e x86/entry: Split and inline prepare_exit_to_usermode()
GCC is unable to properly optimize functions that have a very
short likely case and a longer and register-heavier cold part --
it fails to sink all of the register saving and stack frame
setup code into the unlikely part.

Help it out with prepare_exit_to_usermode() by splitting it into
two parts and inline the hot part.

Saves 6-8 cycles for compat syscalls.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/9fc53eda4a5b924070952f12fa4ae3e477640a07.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:13 +02:00
Andy Lutomirski
dd636071c3 x86/entry: Use pt_regs_to_thread_info() in syscall entry tracing
It generates simpler and faster code than current_thread_info().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/a3b6633e7dcb9f673c1b619afae602d29d27d2cf.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:12 +02:00
Andy Lutomirski
4aabd140f9 x86/entry: Hide two syscall entry assertions behind CONFIG_DEBUG_ENTRY
This shaves a few cycles off the slow paths.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/ce383fa9e129286ce6da6e00b53acd4c9fb5d06a.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:12 +02:00
Andy Lutomirski
c68ca6787b x86/entry: Micro-optimize compat fast syscall arg fetch
We're following a 32-bit pointer, and the uaccess code isn't
smart enough to figure out that the access_ok() check isn't
needed.

This saves about three cycles on a cache-hot fast syscall.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/bdff034e2f23c5eb974c760cf494cb5bddce8f29.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:12 +02:00
Andy Lutomirski
33c52129f4 x86/entry: Force inlining of 32-bit syscall code
On systems that support fast syscalls, we only really care about
the performance of the fast syscall path.  Forcibly inline it
and add a likely annotation.

This saves 4-6 cycles.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/8472036ff1f4b426b4c4c3e3d0b3bf5264407c0c.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:12 +02:00
Andy Lutomirski
460d12453e x86/entry: Make irqs_disabled checks in exit code depend on lockdep
These checks are quite slow.  Disable them in non-lockdep
kernels to reduce the performance hit.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/eccff2a154ae6fb50f40228901003a6e9c24f3d0.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:11 +02:00
Andy Lutomirski
8b13c2552f x86/entry: Remove unnecessary IRQ twiddling in fast 32-bit syscalls
This is slightly messy, but it eliminates an unnecessary cli;sti
pair.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/22f34b1096694a37326f36c53407b8dd90f37948.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:11 +02:00
Andy Lutomirski
487e3bf4f7 x86/asm: Remove thread_info.sysenter_return
It's no longer needed.

We could reinstate something like it as an optimization, which
would remove two cachelines from the fast syscall entry working
set.  I benchmarked it, and it makes no difference whatsoever to
the performance of cache-hot compat syscalls on Sandy Bridge.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/f08cc0cff30201afe9bb565c47134c0a6c1a96a2.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:11 +02:00
Andy Lutomirski
5f310f739b x86/entry/32: Re-implement SYSENTER using the new C path
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/5b99659e8be70f3dd10cd8970a5c90293d9ad9a7.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:10 +02:00
Andy Lutomirski
150ac78d63 x86/entry/32: Switch INT80 to the new C syscall path
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/a7e8d8df96838eae3208dd0441023f3ce7a81831.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:10 +02:00
Andy Lutomirski
39e8701f33 x86/entry/32: Open-code return tracking from fork and kthreads
syscall_exit is going away, and return tracing is just a
function call now, so open-code the two non-syscall 32-bit
users.

While we're at it, update the big register layout comment.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/a6b3c472fda7cda0e368c3ccd553dea7447dfdd2.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:10 +02:00
Andy Lutomirski
7841b40871 x86/entry/compat: Implement opportunistic SYSRETL for compat syscalls
If CS, SS and IP are as expected and FLAGS is compatible with
SYSRETL, then return from fast compat syscalls (both SYSCALL and
SYSENTER) using SYSRETL.

Unlike native 64-bit opportunistic SYSRET, this is not invisible
to user code: RCX and R8-R15 end up in a different state than
shown saved in pt_regs.  To compensate, we only do this when
returning to the vDSO fast syscall return path.  This won't
interfere with syscall restart, as we won't use SYSRETL when
returning to the INT80 restart instruction.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/aa15e49db33773eb10b73d73466b6d5466d7856a.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:10 +02:00
Andy Lutomirski
a474e67c91 x86/vdso/compat: Wire up SYSENTER and SYSCSALL for compat userspace
What, you didn't realize that SYSENTER and SYSCALL were actually
the same thing? :)

Unlike the old code, this actually passes the ptrace_syscall_32
test on AMD systems.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/b74615af58d785aa02d917213ec64e2022a2c796.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:09 +02:00
Andy Lutomirski
710246df58 x86/entry: Add C code for fast system call entries
This handles both SYSENTER and SYSCALL.  The asm glue will take
care of the differences.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/6041a58a9b8ef6d2522ab4350deb1a1945eb563f.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:09 +02:00
Andy Lutomirski
ee08c6bd31 x86/entry/64/compat: Migrate the body of the syscall entry to C
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/a2f0fce68feeba798a24339b5a7ec1ec2dd9eaf7.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:09 +02:00
Andy Lutomirski
bd2d3a3ba6 x86/entry: Add do_syscall_32(), a C function to do 32-bit syscalls
System calls are really quite simple.  Add a helper to call
a 32-bit system call.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/a77ed179834c27da436fb4a7fb23c8ee77abc11c.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:08 +02:00
Andy Lutomirski
eb974c6256 x86/syscalls: Give sys_call_ptr_t a useful type
Syscalls are asmlinkage functions (on 32-bit kernels), take six
args of type unsigned long, and return long.  Note that uml
could probably be slightly cleaned up on top of this patch.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/4d3ecc4a169388d47009175408b2961961744e6f.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:08 +02:00
Andy Lutomirski
034042cc1e x86/entry/syscalls: Move syscall table declarations into asm/syscalls.h
The header was missing some compat declarations.

Also make sys_call_ptr_t have a consistent type.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/3166aaff0fb43897998fcb6ef92991533f8c5c6c.1444091585.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:08 +02:00
Andy Lutomirski
8169aff611 x86/entry/64/compat: Set up full pt_regs for all compat syscalls
This is conceptually simpler.  More importantly, it eliminates
the PTREGSCALL and execve stubs, which were not compatible with
the C ABI.  This means that C code can call through the compat
syscall table.

The execve stubs are a bit subtle.  They did two things: they
cleared some registers and they forced slow-path return.
Neither is necessary any more: elf_common_init clears the extra
registers and start_thread calls force_iret().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/f95b7f7dfaacf88a8cae85bb06226cae53769287.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:07 +02:00
Andy Lutomirski
2ec67971fa x86/entry/64/compat: Remove most of the fast system call machinery
We now have only one code path that calls through the compat
syscall table.  This will make it much more pleasant to change
the pt_regs vs register calling convention, which we need to do
to move the call into C.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/320cda5573cefdc601b955d23fbe8f36c085432d.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:07 +02:00
Andy Lutomirski
c5f638ac90 x86/entry/64/compat: Remove audit optimizations
These audit optimizations are messy and hard to maintain.  We'll
get a similar effect from opportunistic sysret when fast compat
system calls are re-implemented.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/0bcca79ac7ff835d0e5a38725298865b01347a82.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:07 +02:00
Andy Lutomirski
e62a254a1f x86/entry/64/compat: Disable SYSENTER and SYSCALL32 entries
We've disabled the vDSO helpers to call them, so turn off the
entries entirely (temporarily) in preparation for cleaning them
up.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/8d6e84bf651519289dc532dcc230adfabbd2a3eb.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:07 +02:00
Andy Lutomirski
8242c6c84a x86/vdso/32: Save extra registers in the INT80 vsyscall path
The goal is to integrate the SYSENTER and SYSCALL32 entry paths
with the INT80 path.  SYSENTER clobbers ESP and EIP.  SYSCALL32
clobbers ECX (and, invisibly, R11).  SYSRETL (long mode to
compat mode) clobbers ECX and, invisibly, R11.  SYSEXIT (which
we only need for native 32-bit) clobbers ECX and EDX.

This means that we'll need to provide ESP to the kernel in a
register (I chose ECX, since it's only needed for SYSENTER) and
we need to provide the args that normally live in ECX and EDX in
memory.

The epilogue needs to restore ECX and EDX, since user code
relies on regs being preserved.

We don't need to do anything special about EIP, since the kernel
already knows where we are.  The kernel will eventually need to
know where int $0x80 lands, so add a vdso_image entry for it.

The only user-visible effect of this code is that ptrace-induced
changes to ECX and EDX during fast syscalls will be lost.  This
is already the case for the SYSENTER path.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/b860925adbee2d2627a0671fbfe23a7fd04127f8.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:06 +02:00
Andy Lutomirski
7bcdea4d05 x86/elf/64: Clear more registers in elf_common_init()
Before we start calling execve in contexts that honor the full
pt_regs, we need to teach it to initialize all registers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/65a38a9edee61a1158cfd230800c61dbd963dac5.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:06 +02:00
Andy Lutomirski
29c0ce9508 x86/vdso: Replace hex int80 CFI annotations with GAS directives
Maintaining the current CFI annotations written in R'lyehian is
difficult for most of us.  Translate them to something a little
closer to English.

This will remove the CFI data for kernels built with extremely
old versions of binutils.  I think this is a fair tradeoff for
the ability for mortals to edit the asm.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/ae3ff4ff5278b4bfc1e1dab368823469866d4b71.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:06 +02:00
Andy Lutomirski
f24f910884 x86/vdso: Define BUILD_VDSO while building and emit .eh_frame in asm
For the vDSO, user code wants runtime unwind info.  Make sure
that, if we use .cfi directives, we generate it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/16e29ad8855e6508197000d8c41f56adb00d7580.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:05 +02:00
Andy Lutomirski
7b956f035a x86/asm: Re-add parts of the manual CFI infrastructure
Commit:

  131484c8da ("x86/debug: Remove perpetually broken, unmaintainable dwarf annotations")

removed all the manual DWARF annotations outside the vDSO.  It also removed
the macros we used for the manual annotations.

Re-add these macros so that we can clean up the vDSO annotations.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/4c70bb98a8b773c8ccfaabf6745e569ff43e7f65.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-09 09:41:05 +02:00
Daniel Axtens
f2dd80ecca powerpc/powernv: Panic on unhandled Machine Check
All unrecovered machine check errors on PowerNV should cause an
immediate panic. There are 2 reasons that this is the right policy:
it's not safe to continue, and we're already trying to reboot.

Firstly, if we go through the recovery process and do not successfully
recover, we can't be sure about the state of the machine, and it is
not safe to recover and proceed.

Linux knows about the following sources of Machine Check Errors:
- Uncorrectable Errors (UE)
- Effective - Real Address Translation (ERAT)
- Segment Lookaside Buffer (SLB)
- Translation Lookaside Buffer (TLB)
- Unknown/Unrecognised

In the SLB, TLB and ERAT cases, we can further categorise these as
parity errors, multihit errors or unknown/unrecognised.

We can handle SLB errors by flushing and reloading the SLB. We can
handle TLB and ERAT multihit errors by flushing the TLB. (It appears
we may not handle TLB and ERAT parity errors: I will investigate
further and send a followup patch if appropriate.)

This leaves us with uncorrectable errors. Uncorrectable errors are
usually the result of ECC memory detecting an error that it cannot
correct, but they also crop up in the context of PCI cards failing
during DMA writes, and during CAPI error events.

There are several types of UE, and there are 3 places a UE can occur:
Skiboot, the kernel, and userspace. For Skiboot errors, we have the
facility to make some recoverable. For userspace, we can simply kill
(SIGBUS) the affected process. We have no meaningful way to deal with
UEs in kernel space or in unrecoverable sections of Skiboot.

Currently, these unrecovered UEs fall through to
machine_check_expection() in traps.c, which calls die(), which OOPSes
and sends SIGBUS to the process. This sometimes allows us to stumble
onwards. For example we've seen UEs kill the kernel eehd and
khugepaged. However, the process killed could have held a lock, or it
could have been a more important process, etc: we can no longer make
any assertions about the state of the machine. Similarly if we see a
UE in skiboot (and again we've seen this happen), we're not in a
position where we can make any assertions about the state of the
machine.

Likewise, for unknown or unrecognised errors, we're not able to say
anything about the state of the machine.

Therefore, if we have an unrecovered MCE, the most appropriate thing
to do is to panic.

The second reason is that since e784b6499d ("powerpc/powernv: Invoke
opal_cec_reboot2() on unrecoverable machine check errors."), we
attempt a special OPAL reboot on an unhandled MCE. This is so the
hardware can record error data for later debugging.

The comments in that commit assert that we are heading down the panic
path anyway. At the moment this is not always true. With UEs in kernel
space, for instance, they are marked as recoverable by the hardware,
so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't
panic() but would simply die() and OOPS. It doesn't make sense to be
staggering on if we've just tried to reboot: we should panic().

Explicitly panic() on unrecovered MCEs on PowerNV.
Update the comments appropriately.

This fixes some hangs following EEH events on cxlflash setups.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-09 08:07:19 +11:00
Cyril Bur
fdf880a608 powerpc: Fix checkstop in native_hpte_clear() with lockdep
native_hpte_clear() is called in real mode from two places:
- Early in boot during htab initialisation if firmware assisted dump is
  active.
- Late in the kexec path.

In both contexts there is no need to disable interrupts are they are
already disabled. Furthermore, locking around the tlbie() is only required
for pre POWER5 hardware.

On POWER5 or newer hardware concurrent tlbie()s work as expected and on pre
POWER5 hardware concurrent tlbie()s could result in deadlock. This code
would only be executed at crashdump time, during which all bets are off,
concurrent tlbie()s are unlikely and taking locks is unsafe therefore the
best course of action is to simply do nothing. Concurrent tlbie()s are not
possible in the first case as secondary CPUs have not come up yet.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-09 08:01:38 +11:00
Chris Metcalf
7a5692e6e5 arch/powerpc: provide zero_bytemask() for big-endian
For some reason, only the little-endian flavor of
powerpc provided the zero_bytemask() implementation.

Reported-by: Michal Sojka <sojkam1@fel.cvut.cz>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-10-08 11:44:12 -04:00
Ben Hutchings
92b279070d crypto: camellia_aesni_avx - Fix CPU feature checks
We need to explicitly check the AVX and AES CPU features, as we can't
infer them from the related XSAVE feature flags.  For example, the
Core i3 2310M passes the XSAVE feature test but does not implement
AES-NI.

Reported-and-tested-by: Stéphane Glondu <glondu@debian.org>
References: https://bugs.debian.org/800934
Fixes: ce4f5f9b65 ("x86/fpu, crypto x86/camellia_aesni_avx: Simplify...")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable <stable@vger.kernel.org> # 4.2
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-10-08 21:36:49 +08:00
Dave Kleikamp
a66d7f724a crypto: sparc - initialize blkcipher.ivsize
Some of the crypto algorithms write to the initialization vector,
but no space has been allocated for it. This clobbers adjacent memory.

Cc: stable@vger.kernel.org
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2015-10-08 21:36:48 +08:00
Ingo Molnar
d3df65c198 Merge branch 'perf/urgent' into perf/core, before pulling new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-08 10:52:18 +02:00
Christian Melki
9d99c7123c swiotlb: Enable it under x86 PAE
Most distributions end up enabling SWIOTLB already with 32-bit
kernels due to the combination of CONFIG_HYPERVISOR_GUEST|CONFIG_XEN=y
as those end up requiring the SWIOTLB.

However for those that are not interested in virtualization and
run in 32-bit they will discover that: "32-bit PAE 4.2.0 kernel
(no IOMMU code) would hang when writing to my USB disk. The kernel
spews million(-ish messages per sec) to syslog, effectively
"hanging" userspace with my kernel.

Oct  2 14:33:06 voodoochild kernel: [  223.287447] nommu_map_sg:
overflow 25dcac000+1024 of device mask ffffffff
Oct  2 14:33:06 voodoochild kernel: [  223.287448] nommu_map_sg:
overflow 25dcac000+1024 of device mask ffffffff
Oct  2 14:33:06 voodoochild kernel: [  223.287449] nommu_map_sg:
overflow 25dcac000+1024 of device mask ffffffff
... etc ..."

Enabling it makes the problem go away.

N.B. With a6dfa128ce
"config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected"
we also have the important part of the SG macros enabled to make this
work properly - in case anybody wants to backport this patch.

Reported-and-Tested-by: Christian Melki <christian.melki@t2data.com>
Signed-off-by: Christian Melki <christian.melki@t2data.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2015-10-07 15:31:35 -04:00
Linus Torvalds
c6fa8e6de3 arm64 fixes for 4.3-rc5
- A couple of locking fixes for RT kernels
 - Avoid printing bogus initrd warnings when initrd isn't present
 - Performance fix for random mmap file readahead
 - Typo fix
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJWFO+aAAoJELescNyEwWM0z14IALpleyenZXl+xqxMjNyOXouG
 /2SbTZH7iD/vnfCL6G7/Poq00I2ghtBSRFGXajfg7V0mjH1HTfVXVN19IXaUUwjW
 IfAMqSyC43dDBdsGn3A1ZqPRNk+chxjwz7zimGqPowuM87C4aj/sqetBSnuybZtB
 lSYfCFGpDj8cpsJ0xwYYhuq8xUgixQMslTj+rVAFtfsLkDHUu175l+vP7t2xOv5v
 bmyPlz15O/v9febnLYFVFPWZB2IWfvaFkR30qUGsMe6BGWdGDe/RGUPksZDLlPdL
 Yj4AKq+9Bx0lPvO+vNEqfvScKdVjMpttVEMfi2cQ8kUbD1rRPZ7ZTHsfcEOuB1w=
 =Zi+I
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "This addresses a couple of issues found with RT, a broken initrd
  message in the console log and a simple performance fix for some MMC
  workloads.

  Summary:

   - A couple of locking fixes for RT kernels
   - Avoid printing bogus initrd warnings when initrd isn't present
   - Performance fix for random mmap file readahead
   - Typo fix"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: replace read_lock to rcu lock in call_break_hook
  arm64: Don't relocate non-existent initrd
  arm64: convert patch_lock to raw lock
  arm64: readahead: fault retry breaks mmap file read random detection
  arm64: debug: Fix typo in debug-monitors.c
2015-10-07 18:17:46 +01:00
Andy Lutomirski
0a6d1fa0d2 x86/vdso: Remove runtime 32-bit vDSO selection
32-bit userspace will now always see the same vDSO, which is
exactly what used to be the int80 vDSO.  Subsequent patches will
clean it up and make it support SYSENTER and SYSCALL using
alternatives.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/e7e6b3526fa442502e6125fe69486aab50813c32.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07 11:34:08 +02:00
Andy Lutomirski
b611acf473 x86/entry/64/compat: After SYSENTER, move STI after the NT fixup
We eventually want to make it all the way into C code before
enabling interrupts.  We need to rework our flags handling
slightly to delay enabling interrupts.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/35d24d2a9305da3182eab7b2cdfd32902e90962c.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07 11:34:08 +02:00
Andy Lutomirski
72f924783b x86/entry, locking/lockdep: Move lockdep_sys_exit() to prepare_exit_to_usermode()
Rather than worrying about exactly where LOCKDEP_SYS_EXIT should
go in the asm code, add it to prepare_exit_from_usermode() and
remove all of the asm calls that are followed by
prepare_exit_to_usermode().

LOCKDEP_SYS_EXIT now appears only in the syscall fast paths.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1736ebe948b845e68120b86b89091f3ec27f5e8e.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07 11:34:07 +02:00
Andy Lutomirski
dd27f998f0 x86/entry/64/compat: Fix SYSENTER's NT flag before user memory access
Clearing NT is part of the prologue, whereas loading up arg6
makes more sense to think about as part of syscall processing.
Reorder them.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/19eb235828b2d2a52c53459e09f2974e15e65a35.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07 11:34:07 +02:00
Andy Lutomirski
7e0f51cb44 x86/uaccess: Add unlikely() to __chk_range_not_ok() failure paths
This should improve code quality a bit. It also shrinks the kernel text:

 Before:
       text     data      bss       dec    filename
   21828379  5194760  1277952  28301091    vmlinux

 After:
       text     data      bss       dec    filename
   21827997  5194760  1277952  28300709    vmlinux

... by 382 bytes.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/f427b8002d932e5deab9055e0074bb4e7e80ee39.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07 11:34:06 +02:00
Andy Lutomirski
a76cf66e94 x86/uaccess: Tell the compiler that uaccess is unlikely to fault
GCC doesn't realize that get_user(), put_user(), and their __
variants are unlikely to fail.  Tell it.

I noticed this while playing with the C entry code.

 Before:
       text     data      bss       dec    filename
   21828763  5194760  1277952  28301475    vmlinux.baseline

 After:
      text      data      bss       dec    filename
   21828379  5194760  1277952  28301091    vmlinux.new

The generated code shrunk by 384 bytes.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/dc37bed7024319c3004d950d57151fca6aeacf97.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07 11:34:06 +02:00
Ingo Molnar
25a9a924c0 Merge branch 'linus' into x86/asm, to pick up fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-07 11:24:24 +02:00
Linus Torvalds
79c7c7acd2 Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull strscpy fixes from Chris Metcalf :
 "This patch series fixes up a couple of architecture issues where
  strscpy wasn't configured correctly (missing on h8300, duplicating
  local and asm-generic copies on powerpc and tile).

  It also adds a use of zero_bytemask() to the final store for strscpy
  to avoid writing uninitialized data to the destination.  However, to
  make this work we had to add support for zero_bytemask() to the two
  architectures that didn't have it (alpha and tile), because they were
  providing their own local copies, but didn't provide the
  zero_bytemask() that was previously only required when building with
  CONFIG_DCACHE_WORD_ACCESS"

[ Side note: there is still no actual users of strscpy except for the
  one preexisting use in arch/tile that predates the generic version.
  So this is all about fixing the infrastructure so that we eventually
  can start using it.  - Linus ]

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  strscpy: zero any trailing garbage bytes in the destination
  word-at-a-time.h: support zero_bytemask() on alpha and tile
  word-at-a-time.h: fix some Kbuild files
2015-10-07 09:52:42 +01:00
Chris Metcalf
c753bf34c9 word-at-a-time.h: support zero_bytemask() on alpha and tile
Both alpha and tile needed implementations of zero_bytemask.

The alpha version is untested.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-10-06 14:53:16 -04:00
Chris Metcalf
19c22f3a29 word-at-a-time.h: fix some Kbuild files
arch/tile added word-at-a-time.h after the patch that added generic-y
entries; the generic-y entry is now stale.

arch/h8300 is newer than the generic-y patch for word-at-a-time.h,
and needs a generic-y entry.

arch/powerpc seems to have gotten a generic-y entry by mistake in
the first patch; this change removes it.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-10-06 14:52:48 -04:00
Yang Shi
62c6c61adb arm64: replace read_lock to rcu lock in call_break_hook
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 0, irqs_disabled(): 128, pid: 342, name: perf
1 lock held by perf/342:
 #0:  (break_hook_lock){+.+...}, at: [<ffffffc0000851ac>] call_break_hook+0x34/0xd0
irq event stamp: 62224
hardirqs last  enabled at (62223): [<ffffffc00010b7bc>] __call_rcu.constprop.59+0x104/0x270
hardirqs last disabled at (62224): [<ffffffc0000fbe20>] vprintk_emit+0x68/0x640
softirqs last  enabled at (0): [<ffffffc000097928>] copy_process.part.8+0x428/0x17f8
softirqs last disabled at (0): [<          (null)>]           (null)
CPU: 0 PID: 342 Comm: perf Not tainted 4.1.6-rt5 #4
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffffffc000089968>] dump_backtrace+0x0/0x128
[<ffffffc000089ab0>] show_stack+0x20/0x30
[<ffffffc0007030d0>] dump_stack+0x7c/0xa0
[<ffffffc0000c878c>] ___might_sleep+0x174/0x260
[<ffffffc000708ac8>] __rt_spin_lock+0x28/0x40
[<ffffffc000708db0>] rt_read_lock+0x60/0x80
[<ffffffc0000851a8>] call_break_hook+0x30/0xd0
[<ffffffc000085a70>] brk_handler+0x30/0x98
[<ffffffc000082248>] do_debug_exception+0x50/0xb8
Exception stack(0xffffffc00514fe30 to 0xffffffc00514ff50)
fe20:                                     00000000 00000000 c1594680 0000007f
fe40: ffffffff ffffffff 92063940 0000007f 0550dcd8 ffffffc0 00000000 00000000
fe60: 0514fe70 ffffffc0 000be1f8 ffffffc0 0514feb0 ffffffc0 0008948c ffffffc0
fe80: 00000004 00000000 0514fed0 ffffffc0 ffffffff ffffffff 9282a948 0000007f
fea0: 00000000 00000000 9282b708 0000007f c1592820 0000007f 00083914 ffffffc0
fec0: 00000000 00000000 00000010 00000000 00000064 00000000 00000001 00000000
fee0: 005101e0 00000000 c1594680 0000007f c1594740 0000007f ffffffd8 ffffff80
ff00: 00000000 00000000 00000000 00000000 c1594770 0000007f c1594770 0000007f
ff20: 00665e10 00000000 7f7f7f7f 7f7f7f7f 01010101 01010101 00000000 00000000
ff40: 928e4cc0 0000007f 91ff11e8 0000007f

call_break_hook is called in atomic context (hard irq disabled), so replace
the sleepable lock to rcu lock, replace relevant list operations to rcu
version and call synchronize_rcu() in unregister_break_hook().

And, replace write lock to spinlock in {un}register_break_hook.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-06 19:10:28 +01:00
Mark Rutland
4ca3bc86be arm64: Don't relocate non-existent initrd
When booting a kernel without an initrd, the kernel reports that it
moves -1 bytes worth, having gone through the motions with initrd_start
equal to initrd_end:

    Moving initrd from [4080000000-407fffffff] to [9fff49000-9fff48fff]

Prevent this by bailing out early when the initrd size is zero (i.e. we
have no initrd), avoiding the confusing message and other associated
work.

Fixes: 1570f0d7ab ("arm64: support initrd outside kernel linear map")
Cc: Mark Salter <msalter@redhat.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-06 18:33:15 +01:00
Taku Izumi
712df65ccb perf/x86/intel/uncore: Fix multi-segment problem of perf_event_intel_uncore
In multi-segment system, uncore devices may belong to buses whose segment
number is other than 0:

  ....
  0000:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
  ...
  0001:7f:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
  ...
  0001:bf:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03)
  ...
  0001:ff:10.5 System peripheral: Intel Corporation Xeon E5 v3/Core i7 Scratchpad & Semaphore Registers (rev 03
  ...

In that case, relation of bus number and physical id may be broken
because "uncore_pcibus_to_physid" doesn't take account of PCI segment.
For example, bus 0000:ff and 0001:ff uses the same entry of
"uncore_pcibus_to_physid" array.

This patch fixes this problem by introducing the segment-aware pci2phy_map instead.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1443096621-4119-1-git-send-email-izumi.taku@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:31:51 +02:00
Kan Liang
7ce1346a68 perf/x86: Add Intel cstate PMUs support
This patch adds new PMUs to support cstate related free running
(read-only) counters. These counters may be used simultaneously by other
tools, such as turbostat. However, it still make sense to implement them
in perf. Because we can conveniently collect them together with other
events, and allow to use them from tools without special MSR access
code.

These counters include CORE_C*_RESIDENCY and PKG_C*_RESIDENCY.
According to counters' scope and category, two PMUs are registered with
the perf_event core subsystem.

 - 'cstate_core': The counter is available for each physical core. The
                  counters include CORE_C*_RESIDENCY.

 - 'cstate_pkg':  The counter is available for each physical package. The
                  counters include PKG_C*_RESIDENCY.

The events are exposed in sysfs for use by perf stat and other tools.
The files are:

  /sys/devices/cstate_core/events/c*-residency
  /sys/devices/cstate_pkg/events/c*-residency

These events only support system-wide mode counting.
The /sys/devices/cstate_*/cpumask file can be used by tools to figure
out which CPUs to monitor by default.

The PMU type (attr->type) is dynamically allocated and is available from
/sys/devices/core_misc/type and /sys/device/cstate_*/type.

Sampling is not supported.

Here is an example.

 - To caculate the fraction of time when the core is running in C6 state
   CORE_C6_time% = CORE_C6_RESIDENCY / TSC

 # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 sleep 5

   11838820015,,cstate_core/c6-residency/,5175919658,100.00
   11877130740,,msr/tsc/,5175922010,100.00

 For sleep, 99.7% of time we ran in C6 state.

 # perf stat -x, -e"cstate_core/c6-residency/,msr/tsc/" -C0 -- taskset -c 0 busyloop

   1253316,,cstate_core/c6-residency/,4360969154,100.00
   10012635248,,msr/tsc/,4360972366,100.00

 For busyloop, 0.01% of time we ran in C6 state.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1443443404-8581-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:31:51 +02:00
Ingo Molnar
82fc167c39 Linux 4.3-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWEUxnAAoJEHm+PkMAQRiGYCYH/3gtGkFdvSLi+E1PfI8Qk3ZA
 XuYA4Mj09JBVSmaICeueMTDVrdiq0OE0zPib26GWlF/za13kNU8KgMR3+6XCuLSX
 DiCmh6mwDItoNoSIIUERLqrFHABXz8rZ3gb3uu2+kNN74Cl0piNm1YpFclEEWjMr
 9Wk5fkq+ontnDVUQOvWUxPiUXOJTvdLXBWTRDw1yTdE3RMNwRI2d/hme6Hq++WYV
 tRalZZKQaoB33js9WRVAoLVunvtna+i+/y7VGLj8QyS0+d6ec81Hey2r1/fR/oG4
 bs4ul6vtqeb3IR/PjUqxF59pSrCLEO+qrp9KrTlJNYgr1m1QyjRxWUdy/XhyaWo=
 =gIhN
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rc4' into locking/core, to pick up fixes before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:10:28 +02:00
Peter Zijlstra
d87b7a3379 sched/core, sched/x86: Kill thread_info::saved_preempt_count
With the introduction of the context switch preempt_count invariant,
and the demise of PREEMPT_ACTIVE, its pointless to save/restore the
per-cpu preemption count, it must always be 2.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:08:18 +02:00
Peter Zijlstra
609ca06638 sched/core: Create preempt_count invariant
Assuming units of PREEMPT_DISABLE_OFFSET for preempt_count() numbers.

Now that TASK_DEAD no longer results in preempt_count() == 3 during
scheduling, we will always call context_switch() with preempt_count()
== 2.

However, we don't always end up with preempt_count() == 2 in
finish_task_switch() because new tasks get created with
preempt_count() == 1.

Create FORK_PREEMPT_COUNT and set it to 2 and use that in the right
places. Note that we cannot use INIT_PREEMPT_COUNT as that serves
another purpose (boot).

After this, preempt_count() is invariant across the context switch,
with exception of PREEMPT_ACTIVE.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:08:14 +02:00
Florian Fainelli
d836ace65e ARM: orion: Fix DSA platform device after mvmdio conversion
DSA expects the host_dev pointer to be the device structure associated
with the MDIO bus controller driver. First commit breaking that was
c3a07134e6 ("mv643xx_eth: convert to use the Marvell Orion MDIO
driver"), and then, it got completely under the radar for a while.

Reported-by: Frans van de Wiel <fvdw@fvdw.eu>
Fixes: c3a07134e6 ("mv643xx_eth: convert to use the Marvell Orion MDIO driver")
CC: stable@vger.kernel.org
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2015-10-06 16:22:37 +02:00
Linus Torvalds
f6702681a0 xen: bug fixes for 4.3-rc4
- Fix VM save performance regression with x86 PV guests.
 - Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI).
 - Other minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWE8wQAAoJEFxbo/MsZsTRVTMH/0eqSg2M78wv4sBl234Y3FE9
 AN8KFUdlkK7VN9v0uuLMDSKIWNUuFJIvo/2rElWGRiX2Q+/pfnQg3ZSFhub9S8uL
 T4LCvmG9viRFb2oUz792ewqncSw3X98Jpto4smA820gJRjndBSWm5HUKUtPAkv1M
 l5DFMEgOeHbu+wCbKD/ZPEt5K9GsIaNviSNoWtYHirZwrd00oLmNbWp+g8lIGQiT
 3vLW0SaZzjL6akKxihb/p3WZ9eNmyz8yk0V7dItUEVUB9qoaDDLJ5qIRSHHWTWQD
 Jza/GE32VallZLuEXGG5/D86MsnyVYHC+lZtwo2IptOGm8v7WuZRv094wI1ev5c=
 =aiDw
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - Fix VM save performance regression with x86 PV guests

 - Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI)

 - Other minor fixes.

* tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen/p2m: hint at the last populated P2M entry
  x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map
  x86/xen: Support kexec/kdump in HVM guests by doing a soft reset
  xen/x86: Don't try to write syscall-related MSRs for PV guests
  xen: use correct type for HYPERVISOR_memory_op()
2015-10-06 15:05:02 +01:00
Linus Torvalds
3ec20e2e61 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Three bug fixes and an update to the default configuration"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/defconfig: set SCSI_DH=y
  s390/vtime: correct scaled cputime of partially idle CPUs
  s390/boot/decompression: disable floating point in decompressor
  s390/numa: use correct type for node_to_cpumask_map
2015-10-06 14:59:36 +01:00
David Vrabel
98dd166ea3 x86/xen/p2m: hint at the last populated P2M entry
With commit 633d6f17cd (x86/xen: prepare
p2m list for memory hotplug) the P2M may be sized to accomdate a much
larger amount of memory than the domain currently has.

When saving a domain, the toolstack must scan all the P2M looking for
populated pages.  This results in a performance regression due to the
unnecessary scanning.

Instead of reporting (via shared_info) the maximum possible size of
the P2M, hint at the last PFN which might be populated.  This hint is
increased as new leaves are added to the P2M (in the expectation that
they will be used for populated entries).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: <stable@vger.kernel.org> # 4.0+
2015-10-06 13:54:20 +01:00
Arnd Bergmann
5a37b15378 Renesas ARM Based SoC Fixes for v4.3
* Add Add CPG/MSTP Clock Domain for sound on r8a779[01] SoCs.
   This allows sound to work once again.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWDesJAAoJENfPZGlqN0++nEsQALFAhpnTg2lkZgtzQHQh73y9
 V7yNAsUhH/dBGw5Efj7JhxYR1L5DTsVncEPMuDDFMYlGgwtAI8bSFpzpRf4+N69n
 TQItRnr6VbkQT/TsRbKIhOgIgERwaylcIEaP64THyIlCCXQG3zH/9MSiR5d7Q4Ac
 fGJT6HoNGCvysCDS9KJrgHDQP1TotvmxjjDJxYwWWrg1gPlE9jBiPvMP5quqnn6X
 VIn0zW4IhMG8Ajkb92+6XSXnFPlW0QeQA+YNOXK8za4TS/OrZpSQN3H3s9fl41Vh
 MvADUPoPHkka47t4IaR637d85epfPwKH2ignyl3BKe87GwhRHK9tKpsyqvbIcFus
 gsUMASBW5S0kYRxPXCAaLfA/Mj8CRR+yklpdyqftGBMCTNcOOcMsTo4/GdQr44q7
 0brGsVkKD7PGV/X9qjRV76g99svLnH6v3gkvEOGwn+O0e6e8egCKehaSDfDB3/nW
 RZRqyBH/ddnbaKxPAKyN0Bwd2N/yHEHOstgh+vZb8KX1wp3rrlsoqytAuxPuT0o6
 /vTE048dlEkBW/JFWd2cFB7ObUFyeWLOyYynwPNUi2KO/WHH+QYaaEReyopbjrLS
 OvepmiYqmw3I81HIBsd0lVAAYpqmYK/jjq3HmOib0YNuJiT382r4zsuOkMO3+kfQ
 vwAMpzC9sVfF0GVB4khD
 =0wuy
 -----END PGP SIGNATURE-----

Merge tag 'renesas-fixes-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes

Merge "Renesas ARM Based SoC Fixes for v4.3" from Simon Horman

* Add Add CPG/MSTP Clock Domain for sound on r8a779[01] SoCs.
  This allows sound to work once again.

* tag 'renesas-fixes-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
  ARM: shmobile: r8a7791 dtsi: Add CPG/MSTP Clock Domain for sound
  ARM: shmobile: r8a7790 dtsi: Add CPG/MSTP Clock Domain for sound
2015-10-06 14:31:53 +02:00
Arnd Bergmann
b0d58113e5 Allwinner fixes for 4.3
Two patches, one that fixes one of the DT build, and the other raising the
 voltage of the lowest OPP of the A20 to remain within the SoC operating
 boundaries
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWESymAAoJEBx+YmzsjxAgIXAP/1l4jFwcZqNwpmxHJmpMFnnN
 VSa+SXMGbJ9BIBR5dziU137yMvYCwDqVuvopnW6OpkiwtXwEFb4uRpb3Ma49IMw/
 sorOBVFc6EO1Z7MZ+MJc0r+nca6BGn9By10nIAQK1V0q5Bw6vBPM6ivuQE81r2RF
 bhrmzl3wjjO4H7DQtDr4N0qp5iZ9OSh0ZnZbXzwlqj29MlWIS7FaIwdvyOcz++nB
 Q8SI3G4oF27WtQlnfIh0XdU+2IvPQ4HIYSrA/72JLsGy3L/OsQPdIcehFHPG7wvu
 x89PF1ADtBbSqAwTlINyXtnZBVYdJh+z3jmx9gTNXyMuazDh2ar4QFQz1g60p5gd
 ZYwlY+1onkZeF0dKhFQ+1IlJgYN5jGykGvl2oIlYSueU347Si6d0IhELnUnP4xHy
 tx+Xbi6FpkfcQyMCD1M9KfbC5u1sxyXcKDvmtK50H1r64j2hVZZ4KqlUondnHJfP
 Dw99K0GEcPtpek0ly10h+qyObui3Y2U3Tjt5cY2AUodTFQm+mkNNRVD1+YTxMsmG
 RKoXQ1wFoGoteCUWy0V2bu7Pq1x1aYVn8Ov7pvHqn2v9YND8MGodjadzSReuS0Dp
 +mSrvdBYuEzQ0MSq81Iih9pi5g9xy3Keklnt6+5SH7dCkaRENBVOl/dxBwWFIY45
 fuPg1x2L4XSHBkQ7KkG/
 =4XBA
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-fixes-for-4.3' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes

Merge "Allwinner fixes for 4.3" from Maxime Ripard:

Two patches, one that fixes one of the DT build, and the other raising the
voltage of the lowest OPP of the A20 to remain within the SoC operating
boundaries

* tag 'sunxi-fixes-for-4.3' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  ARM: dts: Fix Makefile target for sun4i-a10-itead-iteaduino-plus
  ARM: dts: sunxi: Raise minimum CPU voltage for sun7i-a20 to meet SoC specifications
2015-10-06 14:30:14 +02:00
Arnd Bergmann
c6722ddce0 Samsung fixes for v4.3
- fix invalid clock used for FIMD IOMMU
 - fix thermal boot issue smdk5250-smdk5250
 - fix S2R on exynos4412 trats2 boards
 - fix LEDs on exynos5422-odroidxu3-common
 - fix booting of all 8 cores on exynos542x
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJWEv+CAAoJEA0Cl+kVi2xqx1EP/iGzkPsr/uFikqFX+9gGQU/T
 I7xtPWlTH63edk67IUQynuXhz1LkVGdsFjgpevVXtD4c9G8+2qbKYj7nw3HSi7q3
 sXLkPZcmbA48vhTr15NkA0tI9qmJFbX8NJBm/aYMSd3Cteukep6QvgT7zzYC56/Q
 gWgSQC5IquwHghxzZs8OE1gzcLIEIC6qLam1rPATJ19JYT4SdR6jM/cl+CKICvnP
 p5ncYxY+TTJ9/zX+0wzF10rqb9xWlZqmakaQtlsAh60wZ0tvTGUGUf1cWZ820gR1
 XpeLDYKoO1xxRfhG9Sfzcb9tuvWNXSZkWUWa6aNCl1hqGrJ1waa1pnRqfWDWm54z
 Jnhw2FUB9keiIfgvfsopW7oHhpx34u6Hmnt2AeHRaxtjJqkV/t7XRHOOmyZiK3EK
 u4y4Nlek9lVTOiWu5le8KWxuefpWH9aot8BTOVEd9uKkCY5/2PiAZFR9IQXOk8+u
 Z8auHX4sVcPhcnkp6ggLeYhNQMveCbx7lg6sKVUJa2DmEaO+qcWxxo3MW9ETGFwO
 fLo4zEPmT9dfrSb9YoEUSPy3UBwOIaxmob8e3U5Cw3A5FyzpdP3ViLYXpz97tQDS
 yJDSVis6hFFjPqGxoQBW37WSmIPTX3Q/7eAMWJc85ZC5WzoYO+Z6RWkC/gaUU/yI
 V6FJivK8fLWndk67Dy3H
 =lpRb
 -----END PGP SIGNATURE-----

Merge tag 'samsung-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes

Merge "Samsung fixes for v4.3" from Kukjin Kim:

- fix invalid clock used for FIMD IOMMU
- fix thermal boot issue smdk5250-smdk5250
- fix S2R on exynos4412 trats2 boards
- fix LEDs on exynos5422-odroidxu3-common
- fix booting of all 8 cores on exynos542x

* tag 'samsung-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: dts: Fix wrong clock binding for sysmmu_fimd1_1 on exynos5420
  ARM: dts: Fix bootup thermal issue on smdk5250
  ARM: dts: add suspend opp to exynos4412
  ARM: dts: Fix LEDs on exynos5422-odroidxu3
  ARM: EXYNOS: reset Little cores when cpu is up
2015-10-06 14:26:32 +02:00
Ben Hutchings
da11f98fd0 MIPS: Define ioremap_uc
All architectures must now define ioremap_uc(), but MIPS currently
only has ioremap_nocache().

Fixes: 4c73e89266 ("arch/*/io.h: Add ioremap_uc() to all architectures")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11263/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-06 13:19:25 +02:00
Stephen Smalley
e1a58320a3 x86/mm: Warn on W^X mappings
Warn on any residual W+X mappings after setting NX
if DEBUG_WX is enabled.  Introduce a separate
X86_PTDUMP_CORE config that enables the code for
dumping the page tables without enabling the debugfs
interface, so that DEBUG_WX can be enabled without
exposing the debugfs interface.  Switch EFI_PGT_DUMP
to using X86_PTDUMP_CORE so that it also does not require
enabling the debugfs interface.

On success it prints this to the kernel log:

  x86/mm: Checked W+X mappings: passed, no W+X pages found.

On failure it prints a warning and a count of the failed pages:

  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:226 note_page+0x610/0x7b0()
  x86/mm: Found insecure W+X mapping at address ffffffff81755000/__stop___ex_table+0xfa8/0xabfa8
  [...]
  Call Trace:
   [<ffffffff81380a5f>] dump_stack+0x44/0x55
   [<ffffffff8109d3f2>] warn_slowpath_common+0x82/0xc0
   [<ffffffff8109d48c>] warn_slowpath_fmt+0x5c/0x80
   [<ffffffff8106cfc9>] ? note_page+0x5c9/0x7b0
   [<ffffffff8106d010>] note_page+0x610/0x7b0
   [<ffffffff8106d409>] ptdump_walk_pgd_level_core+0x259/0x3c0
   [<ffffffff8106d5a7>] ptdump_walk_pgd_level_checkwx+0x17/0x20
   [<ffffffff81063905>] mark_rodata_ro+0xf5/0x100
   [<ffffffff817415a0>] ? rest_init+0x80/0x80
   [<ffffffff817415bd>] kernel_init+0x1d/0xe0
   [<ffffffff8174cd1f>] ret_from_fork+0x3f/0x70
   [<ffffffff817415a0>] ? rest_init+0x80/0x80
  ---[ end trace a1f23a1e42a2ac76 ]---
  x86/mm: Checked W+X mappings: FAILED, 171 W+X pages found.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1444064120-11450-1-git-send-email-sds@tycho.nsa.gov
[ Improved the Kconfig help text and made the new option default-y
  if CONFIG_DEBUG_RODATA=y, because it already found buggy mappings,
  so we really want people to have this on by default. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 11:11:48 +02:00
Ingo Molnar
38a413cbc2 Linux 4.3-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWB9f6AAoJEHm+PkMAQRiGiFMIAJYFLIkF/dXFYMNPGsRGRGYO
 SsQkfYzjy4i/yloyVlGB33e6dqxWdVgCeqYC77TO+1CBq34o6dqM4PACTrhjtS+3
 qQvaP/qn6cSoaGIkdD3v43CCiwMpZZ5+Uj7F7Uz8N4twrpykOZFMM5T7f1lrsG2F
 wJGafmvok9NU2F2wYwaJ8JrzsF6iO6ibFeB8BosRF5Ba4nKqiXVI0xNa0R8PFDm3
 tbh/IkkqokemEqnHyWyszhGFsCQupi+QgsjY/LhWUcCaL7HLEgJmkBX0tXNlgMmK
 TFCq7L8Bigu4nlgZ/iVUB9kh4GTBNVcbdRVN3loJFlczFJlIAa171OVlfRu3lvU=
 =m29x
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rc3' into x86/mm, to pick up fixes before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 10:56:54 +02:00
Yang Shi
abffa6f3b1 arm64: convert patch_lock to raw lock
When running kprobe test on arm64 rt kernel, it reports the below warning:

root@qemu7:~# modprobe kprobe_example
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 0, irqs_disabled(): 128, pid: 484, name: modprobe
CPU: 0 PID: 484 Comm: modprobe Not tainted 4.1.6-rt5 #2
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffffffc0000891b8>] dump_backtrace+0x0/0x128
[<ffffffc000089300>] show_stack+0x20/0x30
[<ffffffc00061dae8>] dump_stack+0x1c/0x28
[<ffffffc0000bbad0>] ___might_sleep+0x120/0x198
[<ffffffc0006223e8>] rt_spin_lock+0x28/0x40
[<ffffffc000622b30>] __aarch64_insn_write+0x28/0x78
[<ffffffc000622e48>] aarch64_insn_patch_text_nosync+0x18/0x48
[<ffffffc000622ee8>] aarch64_insn_patch_text_cb+0x70/0xa0
[<ffffffc000622f40>] aarch64_insn_patch_text_sync+0x28/0x48
[<ffffffc0006236e0>] arch_arm_kprobe+0x38/0x48
[<ffffffc00010e6f4>] arm_kprobe+0x34/0x50
[<ffffffc000110374>] register_kprobe+0x4cc/0x5b8
[<ffffffbffc002038>] kprobe_init+0x38/0x7c [kprobe_example]
[<ffffffc000084240>] do_one_initcall+0x90/0x1b0
[<ffffffc00061c498>] do_init_module+0x6c/0x1cc
[<ffffffc0000fd0c0>] load_module+0x17f8/0x1db0
[<ffffffc0000fd8cc>] SyS_finit_module+0xb4/0xc8

Convert patch_lock to raw loc kto avoid this issue.

Although the problem is found on rt kernel, the fix should be applicable to
mainline kernel too.

Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-05 18:30:29 +01:00
Mark Salyzyn
569ba74a7b arm64: readahead: fault retry breaks mmap file read random detection
This is the arm64 portion of commit 45cac65b0f ("readahead: fault
retry breaks mmap file read random detection"), which was absent from
the initial port and has since gone unnoticed. The original commit says:

> .fault now can retry.  The retry can break state machine of .fault.  In
> filemap_fault, if page is miss, ra->mmap_miss is increased.  In the second
> try, since the page is in page cache now, ra->mmap_miss is decreased.  And
> these are done in one fault, so we can't detect random mmap file access.
>
> Add a new flag to indicate .fault is tried once.  In the second try, skip
> ra->mmap_miss decreasing.  The filemap_fault state machine is ok with it.

With this change, Mark reports that:

> Random read improves by 250%, sequential read improves by 40%, and
> random write by 400% to an eMMC device with dm crypto wrapped around it.

Cc: Shaohua Li <shli@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Riley Andrews <riandrews@android.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-05 16:30:50 +01:00
Yang Shi
95485fdc64 arm64: debug: Fix typo in debug-monitors.c
Fix comment typo: s/handers/handlers/

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-10-05 16:30:50 +01:00
Nicolas Schichan
8690f47d6e ARM: net: make BPF_LD | BPF_IND instruction trigger r_X initialisation to 0.
Without this patch, if the only instructions using r_X are of the
BPF_LD | BPF_IND type, r_X would not be reset to 0, using whatever
value was there when entering the jited code. With this patch, r_X
will be correctly marked as used so it will be reset to 0 in the
prologue code.

This fix also makes the test "LD_IND byte default X" pass in the
test_bpf module when the ARM JIT is enabled.

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-05 07:01:08 -07:00
Yousong Zhou
71a0a72456 MIPS: UAPI: Ignore __arch_swab{16,32,64} when using MIPS16
Some GCC versions (e.g. 4.8.3) can incorrectly inline a function with
MIPS32 instructions into another function with MIPS16 code [1], causing
the assembler to genereate incorrect binary code or fail right away
complaining about unrecognized opcode.

In the case of __arch_swab{16,32}, when inlined by the compiler with
flags `-mips32r2 -mips16 -Os', the assembler can fail with the following
error.

    {standard input}:79: Error: unrecognized opcode `wsbh $2,$2'

For performance concerns and to workaround the issue already existing in
older compilers, just ignore these 2 functions when compiling with
mips16 enabled.

 [1] Inlining nomips16 function into mips16 function can result in
     undefined builtins, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55777

Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11241/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-05 11:30:23 +02:00
Yousong Zhou
1bb3bf6226 Revert "MIPS: UAPI: Fix unrecognized opcode WSBH/DSBH/DSHD when using MIPS16."
This reverts commit e0d8b2ec53.

For at least GCC 4.8.3, adding nomips16 function attribute still cannot
prevent it from being inlined in mips16 context.  So revert it first in
preparation for a better workaround.

 [1] Inlining nomips16 function into mips16 function can result in
     undefined builtins, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55777

Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11240/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-05 11:29:57 +02:00
Linus Torvalds
30c44659f4 Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull strscpy string copy function implementation from Chris Metcalf.

Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.

The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.

strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result.  To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.

strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string.  Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated.  It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.

strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG.  It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.

So why did I waffle about this for so long?

Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.

And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.

So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches.  Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use global strscpy() rather than private copy
  string: provide strscpy()
  Make asm/word-at-a-time.h available on all architectures
2015-10-04 16:31:13 +01:00
Linus Torvalds
0d8770815f Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 "This week's round of MIPS fixes:
   - Fix JZ4740 build
   - Fix fallback to GFP_DMA
   - FP seccomp in case of ENOSYS
   - Fix bootmem panic
   - A number of FP and CPS fixes
   - Wire up new syscalls
   - Make sure BPF assembler objects can properly be disassembled
   - Fix BPF assembler code for MIPS I"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: scall: Always run the seccomp syscall filters
  MIPS: Octeon: Fix kernel panic on startup from memory corruption
  MIPS: Fix R2300 FP context switch handling
  MIPS: Fix octeon FP context switch handling
  MIPS: BPF: Fix load delay slots.
  MIPS: BPF: Do all exports of symbols with FEXPORT().
  MIPS: Fix the build on jz4740 after removing the custom gpio.h
  MIPS: CPS: #ifdef on CONFIG_MIPS_MT_SMP rather than CONFIG_MIPS_MT
  MIPS: CPS: Don't include MT code in non-MT kernels.
  MIPS: CPS: Stop dangling delay slot from has_mt.
  MIPS: dma-default: Fix 32-bit fall back to GFP_DMA
  MIPS: Wire up userfaultfd and membarrier syscalls.
2015-10-04 11:41:58 +01:00
Markos Chandras
d218af7849 MIPS: scall: Always run the seccomp syscall filters
The MIPS syscall handler code used to return -ENOSYS on invalid
syscalls. Whilst this is expected, it caused problems for seccomp
filters because the said filters never had the change to run since
the code returned -ENOSYS before triggering them. This caused
problems on the chromium testsuite for filters looking for invalid
syscalls. This has now changed and the seccomp filters are always
run even if the syscall is invalid. We return -ENOSYS once we
return from the seccomp filters. Moreover, similar codepaths have
been merged in the process which simplifies somewhat the overall
syscall code.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11236/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-04 12:10:56 +02:00
Russell King
9a431bd5a2 ARM: make highpte an expert option
When highmem is enabled, there's little reason not to also enable
highpte support as well.  Rather than leaving this to chance, make
it an expert option, and default it to be enabled if highmem is
enabled.  Add some help text to the option while we're here.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:47:09 +01:00
Nicolas Pitre
63ce446c9b ARM: 8433/1: add a VMSPLIT_3G_OPT config option
Mimicking the same config option on x86, this allows for 1GB systems to
have their RAM entirely mapped as low memory.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:41:36 +01:00
Daniel Thompson
0768330d46 ARM: 8439/1: Fix backtrace generation when IPI is masked
Currently on ARM when <SysRq-L> is triggered from an interrupt handler
(e.g. a SysRq issued using UART or kbd) the main CPU will wedge for ten
seconds with interrupts masked before issuing a backtrace for every CPU
except itself.

The new backtrace code introduced by commit 96f0e00378 ("ARM: add
basic support for on-demand backtrace of other CPUs") does not work
correctly when run from an interrupt handler because IPI_CPU_BACKTRACE
is used to generate the backtrace on all CPUs but cannot preempt the
current calling context.

This can be fixed by detecting that the calling context cannot be
preempted and issuing the backtrace directly in this case. Issuing
directly leaves us without any pt_regs to pass to nmi_cpu_backtrace()
so we also modify the generic code to call dump_stack() when its
argument is NULL.

Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:40:51 +01:00
Doug Anderson
001bf455d2 ARM: 8428/1: kgdb: Fix registers on sleeping tasks
Dumping registers from other sleeping tasks in KGDB was totally
failing for me.  All registers were reported as 0 in many cases.

The code was using task_pt_regs(task) to try to get other thread
registers.  This doesn't appear to be the right place to look.  From
my tests, I saw non-zero values in this structure when we were looking
at a kernel thread that had a userspace task associated with it, but
it contained the register values from the userspace task.  So even in
the cases where registers weren't reported as 0 we were still not
showing the right thing.

Instead of using task_pt_regs(task) let's use task_thread_info(task).
This is the same place that is referred to when doing a dump of all
sleeping task stacks (kdb_show_stack() -> show_stack() ->
dump_backtrace() -> unwind_backtrace() -> thread_saved_sp()).

As further evidence that this is the right thing to do, you can find
the following comment in "gdbstub.c" right before it calls
sleeping_thread_to_gdb_regs():
  Pull stuff saved during switch_to; nothing else is accessible (or
  even particularly relevant).  This should be enough for a stack
  trace.
...and if you look at switch_to() it only saves r4-r11, sp and lr.
Those are the same registers that I'm getting out of the
task_thread_info().

With this change you can use "info thread" to see all tasks in the
kernel and you can switch to other tasks and examine them in gdb.

Signed-off-by: Doug Anderson <dianders@chromium.org>
Tested-by: Stephen Boyd <sboyd@codeurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:36:45 +01:00
Marek Szyprowski
7e31210349 ARM: 8427/1: dma-mapping: add support for offset parameter in dma_mmap()
IOMMU-based dma_mmap() implementation lacked proper support for offset
parameter used in mmap call (it always assumed that mapping starts from
offset zero). This patch adds support for offset parameter to IOMMU-based
implementation.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: stable@vger.kernel.org  # v3.6+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:36:45 +01:00
Marek Szyprowski
371f0f085f ARM: 8426/1: dma-mapping: add missing range check in dma_mmap()
dma_mmap() function in IOMMU-based dma-mapping implementation lacked
a check for valid range of mmap parameters (offset and buffer size), what
might have caused access beyond the allocated buffer. This patch fixes
this issue.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: stable@vger.kernel.org  # v3.6+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:36:45 +01:00
Russell King
db695c0509 ARM: remove user cmpxchg syscall
Mark Brand reports that a NEEDS_SYSCALL_FOR_CMPXCHG enabled kernel would
open a security hole in the ghost syscall used to implement cmpxchg, as
it fails to validate the user pointer.

However, in order for this option to be enabled, you'd need to be
building a pre-ARMv6 kernel with SMP support.  There is only one system
known which fits that, which is an early ARM SMP FPGA implementation
based on the ARM926T.

In any case, the Kconfig does not allow SMP to be enabled for pre-ARMv6
systems.

Moreover, even if NEEDS_SYSCALL_FOR_CMPXCHG were to be enabled, the
kernel would not build as __ARM_NR_cmpxchg64 is not defined.

The simple answer is to remove the buggy code.

Reported-by: Mark Brand <markbrand@google.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:36:45 +01:00
Stephen Boyd
6f56a68d0b ARM: 8438/1: Add unwinding to __clear_user_std()
Add unwinding annotations so that unwinding from this function
works properly.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:36:45 +01:00
Maninder Singh
b97b272e72 ARM: 8436/1: hw_breakpoint: remove unnecessary header
Header <asm/kdebug.h> is not needed for arm/hw_breakpoint.c, so remove
the pointless #include.

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:36:44 +01:00
Felipe Balbi
63c27ae798 ARM: 8434/2: Revert "7655/1: smp_twd: make twd_local_timer_of_register() no-op for nosmp"
This reverts commit 904464b91e.

The problem pointed out by commit 904464b91e ("ARM: 7655/1:
smp_twd: make twd_local_timer_of_register() no-op for nosmp")
doesn't exist anymore.

We can safely boot with nosmp and the warning won't show up.

The other side benefit of this patch is that TWD has a chance
to probe on single-core A9 systems such as AM437x which sport
TWD.

While at that, also drop SMP dependency from TWD's Kconfig entry.

Cc: Shawn Guo <shawn.guo@linaro.org>
Cc: Dirk Behme <dirk.behme@de.bosch.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-03 16:36:44 +01:00
Linus Torvalds
2cf30826bb Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Fixes all around the map: W+X kernel mapping fix, WCHAN fixes, two
  build failure fixes for corner case configs, x32 header fix and a
  speling fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds
  x86/mm: Set NX on gap between __ex_table and rodata
  x86/kexec: Fix kexec crash in syscall kexec_file_load()
  x86/process: Unify 32bit and 64bit implementations of get_wchan()
  x86/process: Add proper bound checks in 64bit get_wchan()
  x86, efi, kasan: Fix build failure on !KASAN && KMEMCHECK=y kernels
  x86/hyperv: Fix the build in the !CONFIG_KEXEC_CORE case
  x86/cpufeatures: Correct spelling of the HWP_NOTIFY flag
2015-10-03 10:53:05 -04:00
Linus Torvalds
a758379b03 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
 "Two EFI fixes: one for x86, one for ARM, fixing a boot crash bug that
  can trigger under newer EFI firmware"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm64/efi: Fix boot crash by not padding between EFI_MEMORY_RUNTIME regions
  x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down
2015-10-03 10:46:41 -04:00
Linus Torvalds
5634347dee - Fix for transparent huge page change_protection() logic which was
inadvertently changing a huge pmd page into a pmd table entry.
 - Function graph tracer panic fix caused by the return_to_handler code
   corrupting the multi-regs function return value (composite types).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWDr76AAoJEGvWsS0AyF7xdKsP/1oE1dM/xXhQbYcJxXV3MgnT
 05pXmxxJUz7o2meVcbsz4c4UbhdHaQX2//jsgwxmoTNZo4EVz15c8GLWCPh5IRsw
 FQ/bVbDNmbOMZd4RSKShfIkW4bjelT5Mn/WuxUQoIX0qx316hmfFXMLCK2Gg7iOc
 hLkERWrbwHUynu0/lzE9EphOcLIGMmuT6n4qXtdhiLoFFMg8iuKDoxetj14oR3GC
 LQ5JHpvnS6ECLl50RbVvWLCSymnfhzveGvW/d58rFHFRY5PnjV2LATfLCkaKiz8h
 szxJLFuZZzP0lmhOZ9LUaRnNwTUFx5sg0FMEJaLimnTWZ2KmvxBgMuZz+vutjjlz
 DHsQQWVVW771Yzv4vWkv/4oAd/IMcoZFLaAjVYxcjzEFC/kB/i1zRSe8BMxdTs1u
 xqIi3Iv6c7Kv7VdANfTuR9zvFDPRSLoK1UEqQ0Sdvg9NuP8rPrn2ZaMyL1fIwxaL
 AO9JTAWqCYhgWXfeCAQYI1aDEdeE1ndK7a6eX6RDu1nRupQAHfTvV+DwfLRTF6g2
 T3IwfcDuquZHNaKBR6CIgF0xSzyfk7Wsbf3QPqtIGjGsyoHfrcf/9y0b3yNxXNq9
 GEepvrYQfdoP2xhwOyDK+8kNt0HxMiCrrPD0dni95No8DDct1TJ3kPnBdWyfAWLi
 sNNSuGbqMTRpONnuC9kK
 =AJCF
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix for transparent huge page change_protection() logic which was
   inadvertently changing a huge pmd page into a pmd table entry.

 - Function graph tracer panic fix caused by the return_to_handler code
   corrupting the multi-regs function return value (composite types).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: ftrace: fix function_graph tracer panic
  arm64: Fix THP protection change logic
2015-10-02 14:54:16 -04:00
Linus Torvalds
b55a97e759 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
 "Summary:
   - Fix for accidental modification of arguments of syscall functions
   - Wire up new syscalls
   - Update defconfigs"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k/defconfig: Update defconfigs for v4.3-rc1
  m68k: Define asmlinkage_protect
  m68k: Wire up membarrier
  m68k: Wire up userfaultfd
  m68k: Wire up direct socket calls
2015-10-02 14:51:46 -04:00
Matt Bennett
66803dd919 MIPS: Octeon: Fix kernel panic on startup from memory corruption
During development it was found that a number of builds would panic
during the kernel init process, more specifically in 'delayed_fput()'.
The panic showed the kernel trying to access a memory address of
'0xb7fdc00' while traversing the 'delayed_fput_list' structure.
Comparing this memory address to the value of the pointer used on
builds that did not panic confirmed that the pointer on crashing
builds must have been corrupted at some stage earlier in the init
process.

By traversing the list earlier and earlier in the code it was found
that 'plat_mem_setup()' was responsible for corrupting the list.
Specifically the line:

    memory = cvmx_bootmem_phy_alloc(mem_alloc_size,
			__pa_symbol(&__init_end), -1,
			0x100000,
			CVMX_BOOTMEM_FLAG_NO_LOCKING);

Which would eventually call:

    cvmx_bootmem_phy_set_size(new_ent_addr,
		cvmx_bootmem_phy_get_size
		(ent_addr) -
		(desired_min_addr -
			ent_addr));

Where 'new_ent_addr'=0x4800000 (the address of 'delayed_fput_list')
and the second argument (size)=0xb7fdc00 (the address causing the
kernel panic). The job of this part of 'plat_mem_setup()' is to
allocate chunks of memory for the kernel to use. At the start of
each chunk of memory the size of the chunk is written, hence the
value 0xb7fdc00 is written onto memory at 0x4800000, therefore the
kernel panics when it goes back to access 'delayed_fput_list' later
on in the initialisation process.

On builds that were not crashing it was found that the compiler had
placed 'delayed_fput_list' at 0x4800008, meaning it wasn't corrupted
(but something else in memory was overwritten).

As can be seen in the first function call above the code begins to
allocate chunks of memory beginning from the symbol '__init_end'.
The MIPS linker script (vmlinux.lds.S) however defines the .bss
section to begin after '__init_end'. Therefore memory within the
.bss section is allocated to the kernel to use (System.map shows
'delayed_fput_list' and other kernel structures to be in .bss).

To stop the kernel panic (and the .bss section being corrupted)
memory should begin being allocated from the symbol '_end'.

Signed-off-by: Matt Bennett <matt.bennett@alliedtelesis.co.nz>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: aleksey.makarov@auriga.com
Patchwork: https://patchwork.linux-mips.org/patch/11251/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-02 19:19:55 +02:00
Paul Burton
085c2f25d3 MIPS: Fix R2300 FP context switch handling
Commit 1a3d59579b ("MIPS: Tidy up FPU context switching") removed FP
context saving from the asm-written resume function in favour of reusing
existing code to perform the same task. However it only removed the FP
context saving code from the r4k_switch.S implementation of resume.
Remove it from the r2300_switch.S implementation too in order to prevent
attempting to save the FP context twice, which would likely lead to an
exception from the second save because the FPU had already been disabled
by the first save.

This patch has only been build tested, using rbtx49xx_defconfig.

Fixes: 1a3d59579b ("MIPS: Tidy up FPU context switching")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: linux-kernel@vger.kernel.org
Cc: Manuel Lauss <manuel.lauss@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/11167/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-02 19:16:46 +02:00
Paul Burton
0fa24340f7 MIPS: Fix octeon FP context switch handling
Commit 1a3d59579b ("MIPS: Tidy up FPU context switching") removed FP
context saving from the asm-written resume function in favour of reusing
existing code to perform the same task. However it only removed the FP
context saving code from the r4k_switch.S implementation of resume.
Octeon uses its own implementation in octeon_switch.S, so remove FP
context saving there too in order to prevent attempting to save context
twice. That formerly led to an exception from the second save as follows
because the FPU had already been disabled by the first save:

    do_cpu invoked from kernel context![#1]:
    CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.3.0-rc2-dirty #2
    task: 800000041f84a008 ti: 800000041f864000 task.ti: 800000041f864000
    $ 0   : 0000000000000000 0000000010008ce1 0000000000100000 ffffffffbfffffff
    $ 4   : 800000041f84a008 800000041f84ac08 800000041f84c000 0000000000000004
    $ 8   : 0000000000000001 0000000000000000 0000000000000000 0000000000000001
    $12   : 0000000010008ce3 0000000000119c60 0000000000000036 800000041f864000
    $16   : 800000041f84ac08 800000000792ce80 800000041f84a008 ffffffff81758b00
    $20   : 0000000000000000 ffffffff8175ae50 0000000000000000 ffffffff8176c740
    $24   : 0000000000000006 ffffffff81170300
    $28   : 800000041f864000 800000041f867d90 0000000000000000 ffffffff815f3fa0
    Hi    : 0000000000fa8257
    Lo    : ffffffffe15cfc00
    epc   : ffffffff8112821c resume+0x9c/0x200
    ra    : ffffffff815f3fa0 __schedule+0x3f0/0x7d8
    Status: 10008ce2        KX SX UX KERNEL EXL
    Cause : 1080002c (ExcCode 0b)
    PrId  : 000d0601 (Cavium Octeon+)
    Modules linked in:
    Process kthreadd (pid: 2, threadinfo=800000041f864000, task=800000041f84a008, tls=0000000000000000)
    Stack : ffffffff81604218 ffffffff815f7e08 800000041f84a008 ffffffff811681b0
              800000041f84a008 ffffffff817e9878 0000000000000000 ffffffff81770000
              ffffffff81768340 ffffffff81161398 0000000000000001 0000000000000000
              0000000000000000 ffffffff815f4424 0000000000000000 ffffffff81161d68
              ffffffff81161be8 0000000000000000 0000000000000000 0000000000000000
              0000000000000000 0000000000000000 0000000000000000 ffffffff8111e16c
              0000000000000000 0000000000000000 0000000000000000 0000000000000000
              0000000000000000 0000000000000000 0000000000000000 0000000000000000
              0000000000000000 0000000000000000 0000000000000000 0000000000000000
              0000000000000000 0000000000000000 0000000000000000 0000000000000000
              ...
    Call Trace:
    [<ffffffff8112821c>] resume+0x9c/0x200
    [<ffffffff815f3fa0>] __schedule+0x3f0/0x7d8
    [<ffffffff815f4424>] schedule+0x34/0x98
    [<ffffffff81161d68>] kthreadd+0x180/0x198
    [<ffffffff8111e16c>] ret_from_kernel_thread+0x14/0x1c

Tested using cavium_octeon_defconfig on an EdgeRouter Lite.

Fixes: 1a3d59579b ("MIPS: Tidy up FPU context switching")
Reported-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Aleksey Makarov <aleksey.makarov@auriga.com>
Cc: linux-kernel@vger.kernel.org
Cc: Chandrakala Chavva <cchavva@caviumnetworks.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Leonid Rosenboim <lrosenboim@caviumnetworks.com>
Patchwork: https://patchwork.linux-mips.org/patch/11166/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-02 19:16:06 +02:00
Li Bin
ee556d00cf arm64: ftrace: fix function_graph tracer panic
When function graph tracer is enabled, the following operation
will trigger panic:

mount -t debugfs nodev /sys/kernel
echo next_tgid > /sys/kernel/tracing/set_ftrace_filter
echo function_graph > /sys/kernel/tracing/current_tracer
ls /proc/

------------[ cut here ]------------
[  198.501417] Unable to handle kernel paging request at virtual address cb88537fdc8ba316
[  198.506126] pgd = ffffffc008f79000
[  198.509363] [cb88537fdc8ba316] *pgd=00000000488c6003, *pud=00000000488c6003, *pmd=0000000000000000
[  198.517726] Internal error: Oops: 94000005 [#1] SMP
[  198.518798] Modules linked in:
[  198.520582] CPU: 1 PID: 1388 Comm: ls Tainted: G
[  198.521800] Hardware name: linux,dummy-virt (DT)
[  198.522852] task: ffffffc0fa9e8000 ti: ffffffc0f9ab0000 task.ti: ffffffc0f9ab0000
[  198.524306] PC is at next_tgid+0x30/0x100
[  198.525205] LR is at return_to_handler+0x0/0x20
[  198.526090] pc : [<ffffffc0002a1070>] lr : [<ffffffc0000907c0>] pstate: 60000145
[  198.527392] sp : ffffffc0f9ab3d40
[  198.528084] x29: ffffffc0f9ab3d40 x28: ffffffc0f9ab0000
[  198.529406] x27: ffffffc000d6a000 x26: ffffffc000b786e8
[  198.530659] x25: ffffffc0002a1900 x24: ffffffc0faf16c00
[  198.531942] x23: ffffffc0f9ab3ea0 x22: 0000000000000002
[  198.533202] x21: ffffffc000d85050 x20: 0000000000000002
[  198.534446] x19: 0000000000000002 x18: 0000000000000000
[  198.535719] x17: 000000000049fa08 x16: ffffffc000242efc
[  198.537030] x15: 0000007fa472b54c x14: ffffffffff000000
[  198.538347] x13: ffffffc0fada84a0 x12: 0000000000000001
[  198.539634] x11: ffffffc0f9ab3d70 x10: ffffffc0f9ab3d70
[  198.540915] x9 : ffffffc0000907c0 x8 : ffffffc0f9ab3d40
[  198.542215] x7 : 0000002e330f08f0 x6 : 0000000000000015
[  198.543508] x5 : 0000000000000f08 x4 : ffffffc0f9835ec0
[  198.544792] x3 : cb88537fdc8ba316 x2 : cb88537fdc8ba306
[  198.546108] x1 : 0000000000000002 x0 : ffffffc000d85050
[  198.547432]
[  198.547920] Process ls (pid: 1388, stack limit = 0xffffffc0f9ab0020)
[  198.549170] Stack: (0xffffffc0f9ab3d40 to 0xffffffc0f9ab4000)
[  198.582568] Call trace:
[  198.583313] [<ffffffc0002a1070>] next_tgid+0x30/0x100
[  198.584359] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[  198.585503] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[  198.586574] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[  198.587660] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
[  198.588896] Code: aa0003f5 2a0103f4 b4000102 91004043 (885f7c60)
[  198.591092] ---[ end trace 6a346f8f20949ac8 ]---

This is because when using function graph tracer, if the traced
function return value is in multi regs ([x0-x7]), return_to_handler
may corrupt them. So in return_to_handler, the parameter regs should
be protected properly.

Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Acked-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-10-02 11:12:56 +01:00
Ralf Baechle
0c5d187828 MIPS: BPF: Fix load delay slots.
The entire bpf_jit_asm.S is written in noreorder mode because "we know
better" according to a comment.  This also prevented the assembler from
throwing in the required NOPs for MIPS I processors which have no
load-use interlock, thus the load's consumer might end up using the
old value of the register from prior to the load.

Fixed by putting the assembler in reorder mode for just the affected
load instructions.  This is not enough for gas to actually try to be
clever by looking at the next instruction and inserting a nop only
when needed but as the comment said "we know better", so getting gas
to unconditionally emit a NOP is just right in this case and prevents
adding further ifdefery.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-02 09:48:57 +02:00
Ben Hutchings
f4b4aae182 x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds
On x32, gcc predefines __x86_64__ but long is only 32-bit.  Use
__ILP32__ to distinguish x32.

Fixes this compiler error in perf:

	tools/include/asm-generic/bitops/__ffs.h: In function '__ffs':
	tools/include/asm-generic/bitops/__ffs.h:19:8: error: right shift count >= width of type [-Werror=shift-count-overflow]
	  word >>= 32;
	       ^

This isn't sufficient to build perf for x32, though.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443660043.2730.15.camel@decadent.org.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-02 09:43:21 +02:00
Stephen Smalley
ab76f7b4ab x86/mm: Set NX on gap between __ex_table and rodata
Unused space between the end of __ex_table and the start of
rodata can be left W+x in the kernel page tables.  Extend the
setting of the NX bit to cover this gap by starting from
text_end rather than rodata_start.

  Before:
  ---[ High Kernel Mapping ]---
  0xffffffff80000000-0xffffffff81000000          16M                               pmd
  0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
  0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
  0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB x  pte
  0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
  0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
  0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
  0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
  0xffffffff82200000-0xffffffffa0000000         478M                               pmd

  After:
  ---[ High Kernel Mapping ]---
  0xffffffff80000000-0xffffffff81000000          16M                               pmd
  0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
  0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
  0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB NX pte
  0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
  0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
  0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
  0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
  0xffffffff82200000-0xffffffffa0000000         478M                               pmd

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443704662-3138-1-git-send-email-sds@tycho.nsa.gov
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-02 09:21:06 +02:00
Lee, Chun-Yi
e3c41e37b0 x86/kexec: Fix kexec crash in syscall kexec_file_load()
The original bug is a page fault crash that sometimes happens
on big machines when preparing ELF headers:

    BUG: unable to handle kernel paging request at ffffc90613fc9000
    IP: [<ffffffff8103d645>] prepare_elf64_ram_headers_callback+0x165/0x260

The bug is caused by us under-counting the number of memory ranges
and subsequently not allocating enough ELF header space for them.
The bug is typically masked on smaller systems, because the ELF header
allocation is rounded up to the next page.

This patch modifies the code in fill_up_crash_elf_data() by using
walk_system_ram_res() instead of walk_system_ram_range() to correctly
count the max number of crash memory ranges. That's because the
walk_system_ram_range() filters out small memory regions that
reside in the same page, but walk_system_ram_res() does not.

Here's how I found the bug:

After tracing prepare_elf64_headers() and prepare_elf64_ram_headers_callback(),
the code uses walk_system_ram_res() to fill-in crash memory regions information
to the program header, so it counts those small memory regions that
reside in a page area.

But, when the kernel was using walk_system_ram_range() in
fill_up_crash_elf_data() to count the number of crash memory regions,
it filters out small regions.

I printed those small memory regions, for example:

  kexec: Get nr_ram ranges. vaddr=0xffff880077592258 paddr=0x77592258, sz=0xdc0

Based on the code in walk_system_ram_range(), this memory region
will be filtered out:

  pfn = (0x77592258 + 0x1000 - 1) >> 12 = 0x77593
  end_pfn = (0x77592258 + 0xfc0 -1 + 1) >> 12 = 0x77593
  end_pfn - pfn = 0x77593 - 0x77593 = 0  <=== if (end_pfn > pfn) is FALSE

So, the max_nr_ranges that's counted by the kernel doesn't include
small memory regions - causing us to under-allocate the required space.
That causes the page fault crash that happens in a later code path
when preparing ELF headers.

This bug is not easy to reproduce on small machines that have few
CPUs, because the allocated page aligned ELF buffer has more free
space to cover those small memory regions' PT_LOAD headers.

Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443531537-29436-1-git-send-email-jlee@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-02 09:13:06 +02:00
Linus Torvalds
bde17b90dd Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "12 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  dmapool: fix overflow condition in pool_find_page()
  thermal: avoid division by zero in power allocator
  memcg: remove pcp_counter_lock
  kprobes: use _do_fork() in samples to make them work again
  drivers/input/joystick/Kconfig: zhenhua.c needs BITREVERSE
  memcg: make mem_cgroup_read_stat() unsigned
  memcg: fix dirty page migration
  dax: fix NULL pointer in __dax_pmd_fault()
  mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault
  mm/slab: fix unexpected index mapping result of kmalloc_size(INDEX_NODE+1)
  userfaultfd: remove kernel header include from uapi header
  arch/x86/include/asm/efi.h: fix build failure
2015-10-01 22:20:11 -04:00
Andrey Ryabinin
a523841ee4 arch/x86/include/asm/efi.h: fix build failure
With KMEMCHECK=y, KASAN=n:

  arch/x86/platform/efi/efi.c:673:3: error: implicit declaration of function `memcpy' [-Werror=implicit-function-declaration]
  arch/x86/platform/efi/efi_64.c:139:2: error: implicit declaration of function `memcpy' [-Werror=implicit-function-declaration]
  arch/x86/include/asm/desc.h:121:2: error: implicit declaration of function `memcpy' [-Werror=implicit-function-declaration]

Don't #undef memcpy if KASAN=n.

Fixes: 769a8089c1 ("x86, efi, kasan: #undef memset/memcpy/memmove per arch")
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-10-01 21:42:35 -04:00
Linus Torvalds
ccf70ddcbe (Relatively) a lot of reverts, mostly.
Bugs have trickled in for a new feature in 4.2 (MTRR support in guests)
 so I'm reverting it all; let's not make this -rc period busier for KVM
 than it's been so far.  This covers the four reverts from me.
 
 The fifth patch is being reverted because Radim found a bug in the
 implementation of stable scheduler clock, *but* also managed to implement
 the feature entirely without hypervisor support.  So instead of fixing
 the hypervisor side we can remove it completely; 4.4 will get the new
 implementation.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWDXc/AAoJEL/70l94x66D8GoH/0WXeSYHn8+Ql5oZ5vI0QcCG
 6MiKVixhHTOpkug2QE4DGClYoFSUPuDEB/w6D7YciNn0quDHFZbI3XEMXYtLobHN
 0J9cMv9Vpy5pBVMG/LJOw9pFAJRdhSx/cHU2DW9vUiRG9dO9zuxFzBtUciWLOPAX
 tSQfDumeUV30BsTP5ldi9kaIUJBM9oBD4JhES0JHx6ePBvy+9vCRmHotugzrrGx6
 N+AbCmwUwxnK29PF9i7KMfex6T8l1uQG3fwWVazHoswsqbFEQyF6NpaSTYoZkjM9
 6gaXEE1FQ7tRhuio4bBDos0lLu6iGesveP71p/HpULleq2sbH2ER8TpzR5iSnQA=
 =zAJS
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "(Relatively) a lot of reverts, mostly.

  Bugs have trickled in for a new feature in 4.2 (MTRR support in
  guests) so I'm reverting it all; let's not make this -rc period busier
  for KVM than it's been so far.  This covers the four reverts from me.

  The fifth patch is being reverted because Radim found a bug in the
  implementation of stable scheduler clock, *but* also managed to
  implement the feature entirely without hypervisor support.  So instead
  of fixing the hypervisor side we can remove it completely; 4.4 will
  get the new implementation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS
  Update KVM homepage Url
  Revert "KVM: SVM: use NPT page attributes"
  Revert "KVM: svm: handle KVM_X86_QUIRK_CD_NW_CLEARED in svm_get_mt_mask"
  Revert "KVM: SVM: Sync g_pat with guest-written PAT value"
  Revert "KVM: x86: apply guest MTRR virtualization on host reserved pages"
  Revert "KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR"
2015-10-01 16:43:25 -04:00
Thomas Hebb
1f744fd317 ARM: dts: berlin: change BG2Q's USB PHY compatible
Currently, BG2Q shares a compatible with BG2. This is incorrect, since
BG2 and BG2Q use different USB PLL dividers. In reality, BG2Q shares a
divider with BG2CD. Change BG2Q's USB PHY compatible string to reflect
that.

Cc: <stable@vger.kernel.org> # v4.2.0-
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
2015-10-01 21:07:15 +02:00
Steve Capper
1a541b4e3c arm64: Fix THP protection change logic
6910fa1 ("arm64: enable PTE type bit in the mask for pte_modify") fixes
a problem whereby a large block of PROT_NONE mapped memory is
incorrectly mapped as block descriptors when mprotect is called.

Unfortunately, a subtle bug was introduced by this fix to the THP logic.

If one mmaps a large block of memory, then faults it such that it is
collapsed into THPs; resulting calls to mprotect on this area of memory
will lead to incorrect table descriptors being written instead of block
descriptors. This is because pmd_modify calls pte_modify which is now
allowed to modify the type of the page table entry.

This patch reverts commit 6910fa16db, and
fixes the problem it was trying to address by adjusting PAGE_NONE to
represent a table entry. Thus no change in pte type is required when
moving from PROT_NONE to a different protection.

Fixes: 6910fa16db ("arm64: enable PTE type bit in the mask for pte_modify")
Cc: <stable@vger.kernel.org> # 4.0+
Cc: Feng Kan <fkan@apm.com>
Reported-by: Ganapatrao Kulkarni <Ganapatrao.Kulkarni@caviumnetworks.com>
Tested-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-10-01 18:02:21 +01:00
Thomas Gleixner
bebcb8dab6 Merge branch 'irq/for-arm' into irq/core
Bring in the change which we offered arm[64] folks to pull into their
trees.
2015-10-01 16:14:24 +02:00
Ralf Baechle
1e16a8f116 MIPS: BPF: Do all exports of symbols with FEXPORT().
FEXPORT also marks the symbol as code using .type symbol, @function.
Without objdump -d will output only a hexdump for code following the
affected symbols.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-01 15:45:44 +02:00
Dirk Müller
d2922422c4 Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS
The cpu feature flags are not ever going to change, so warning
everytime can cause a lot of kernel log spam
(in our case more than 10GB/hour).

The warning seems to only occur when nested virtualization is
enabled, so it's probably triggered by a KVM bug.  This is a
sensible and safe change anyway, and the KVM bug fix might not
be suitable for stable releases anyway.

Cc: stable@vger.kernel.org
Signed-off-by: Dirk Mueller <dmueller@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01 14:59:37 +02:00
Paolo Bonzini
fc07e76ac7 Revert "KVM: SVM: use NPT page attributes"
This reverts commit 3c2e7f7de3.
Initializing the mapping from MTRR to PAT values was reported to
fail nondeterministically, and it also caused extremely slow boot
(due to caching getting disabled---bug 103321) with assigned devices.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Sebastian Schuette <dracon@ewetel.net>
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01 13:30:44 +02:00
Paolo Bonzini
bcf166a994 Revert "KVM: svm: handle KVM_X86_QUIRK_CD_NW_CLEARED in svm_get_mt_mask"
This reverts commit 5492830370.
It builds on the commit that is being reverted next.

Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01 13:30:43 +02:00
Paolo Bonzini
625422f60c Revert "KVM: SVM: Sync g_pat with guest-written PAT value"
This reverts commit e098223b78,
which has a dependency on other commits being reverted.

Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01 13:30:43 +02:00
Paolo Bonzini
606decd670 Revert "KVM: x86: apply guest MTRR virtualization on host reserved pages"
This reverts commit fd717f1101.
It was reported to cause Machine Check Exceptions (bug 104091).

Reported-by: harn-solo@gmx.de
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-10-01 13:30:42 +02:00
Ard Biesheuvel
0ce3cc008e arm64/efi: Fix boot crash by not padding between EFI_MEMORY_RUNTIME regions
The new Properties Table feature introduced in UEFIv2.5 may
split memory regions that cover PE/COFF memory images into
separate code and data regions. Since these regions only differ
in the type (runtime code vs runtime data) and the permission
bits, but not in the memory type attributes (UC/WC/WT/WB), the
spec does not require them to be aligned to 64 KB.

Since the relative offset of PE/COFF .text and .data segments
cannot be changed on the fly, this means that we can no longer
pad out those regions to be mappable using 64 KB pages.
Unfortunately, there is no annotation in the UEFI memory map
that identifies data regions that were split off from a code
region, so we must apply this logic to all adjacent runtime
regions whose attributes only differ in the permission bits.

So instead of rounding each memory region to 64 KB alignment at
both ends, only round down regions that are not directly
preceded by another runtime region with the same type
attributes. Since the UEFI spec does not mandate that the memory
map be sorted, this means we also need to sort it first.

Note that this change will result in all EFI_MEMORY_RUNTIME
regions whose start addresses are not aligned to the OS page
size to be mapped with executable permissions (i.e., on kernels
compiled with 64 KB pages). However, since these mappings are
only active during the time that UEFI Runtime Services are being
invoked, the window for abuse is rather small.

Tested-by: Mark Salter <msalter@redhat.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [UEFI 2.4 only]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Reviewed-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org> # v4.0+
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443218539-7610-3-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-01 12:51:28 +02:00
Matt Fleming
a5caa209ba x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down
Beginning with UEFI v2.5 EFI_PROPERTIES_TABLE was introduced
that signals that the firmware PE/COFF loader supports splitting
code and data sections of PE/COFF images into separate EFI
memory map entries. This allows the kernel to map those regions
with strict memory protections, e.g. EFI_MEMORY_RO for code,
EFI_MEMORY_XP for data, etc.

Unfortunately, an unwritten requirement of this new feature is
that the regions need to be mapped with the same offsets
relative to each other as observed in the EFI memory map. If
this is not done crashes like this may occur,

  BUG: unable to handle kernel paging request at fffffffefe6086dd
  IP: [<fffffffefe6086dd>] 0xfffffffefe6086dd
  Call Trace:
   [<ffffffff8104c90e>] efi_call+0x7e/0x100
   [<ffffffff81602091>] ? virt_efi_set_variable+0x61/0x90
   [<ffffffff8104c583>] efi_delete_dummy_variable+0x63/0x70
   [<ffffffff81f4e4aa>] efi_enter_virtual_mode+0x383/0x392
   [<ffffffff81f37e1b>] start_kernel+0x38a/0x417
   [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
   [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef

Here 0xfffffffefe6086dd refers to an address the firmware
expects to be mapped but which the OS never claimed was mapped.
The issue is that included in these regions are relative
addresses to other regions which were emitted by the firmware
toolchain before the "splitting" of sections occurred at
runtime.

Needless to say, we don't satisfy this unwritten requirement on
x86_64 and instead map the EFI memory map entries in reverse
order. The above crash is almost certainly triggerable with any
kernel newer than v3.13 because that's when we rewrote the EFI
runtime region mapping code, in commit d2f7cbe7b2 ("x86/efi:
Runtime services virtual mapping"). For kernel versions before
v3.13 things may work by pure luck depending on the
fragmentation of the kernel virtual address space at the time we
map the EFI regions.

Instead of mapping the EFI memory map entries in reverse order,
where entry N has a higher virtual address than entry N+1, map
them in the same order as they appear in the EFI memory map to
preserve this relative offset between regions.

This patch has been kept as small as possible with the intention
that it should be applied aggressively to stable and
distribution kernels. It is very much a bugfix rather than
support for a new feature, since when EFI_PROPERTIES_TABLE is
enabled we must map things as outlined above to even boot - we
have no way of asking the firmware not to split the code/data
regions.

In fact, this patch doesn't even make use of the more strict
memory protections available in UEFI v2.5. That will come later.

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chun-Yi <jlee@suse.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: Lee, Chun-Yi <jlee@suse.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443218539-7610-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-01 12:51:28 +02:00
Geliang Tang
a7e705af52 x86/irq: Drop unlikely before IS_ERR_OR_NULL
IS_ERR_OR_NULL already contain an unlikely compiler flag. Drop it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/03d18502ed7ed417f136c091f417d2d88c147ec6.1443667610.git.geliangtang@163.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-10-01 11:08:56 +02:00
Vaishali Thakkar
c2365b9388 perf/x86/intel/uncore: Do not use macro DEFINE_PCI_DEVICE_TABLE()
The DEFINE_PCI_DEVICE_TABLE() macro is deprecated. Use
'struct pci_device_id' instead of DEFINE_PCI_DEVICE_TABLE(),
with the goal of getting rid of this macro completely.

This Coccinelle semantic patch performs this transformation:

	@@
	identifier a;
	declarer name DEFINE_PCI_DEVICE_TABLE;
	initializer i;
	@@
	- DEFINE_PCI_DEVICE_TABLE(a)
	+ const struct pci_device_id a[] = i;

Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20151001085201.GA16939@localhost
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-01 10:53:03 +02:00
Sebastian Ott
daad0bf149 s390/defconfig: set SCSI_DH=y
Fix this warning:
arch/s390/configs/performance_defconfig:380:warning: symbol value 'm' invalid for SCSI_DH

Introduced via 086b91d052
(scsi_dh: integrate into the core SCSI code)

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-01 10:48:36 +02:00
Alban Bedel
5b235dc264 MIPS: Fix the build on jz4740 after removing the custom gpio.h
Somehow the wrong version of the patch to remove the use of custom
gpio.h on mips has been merged. This patch add the missing fixes for a
build error on jz4740 because linux/gpio.h doesn't provide any machine
specfics definitions anymore.

Signed-off-by: Alban Bedel <albeu@free.fr>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11089/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-10-01 09:06:26 +02:00
Ingo Molnar
95c632f4e4 Merge remote-tracking branch 'tglx/x86/urgent' into x86/urgent
Pick up the WCHAN fixes from Thomas.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-01 09:02:11 +02:00
Denys Vlasenko
dae0f305d6 x86/signal: Deinline get_sigframe, save 240 bytes
This function compiles to 277 bytes of machine code and has 4 callsites.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Link: http://lkml.kernel.org/r/1443443037-22077-4-git-send-email-dvlasenk@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-30 21:54:40 +02:00
Denys Vlasenko
c368ef2866 x86: Deinline early_console_register, save 403 bytes
This function compiles to 60 bytes of machine code.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Link: http://lkml.kernel.org/r/1443443037-22077-3-git-send-email-dvlasenk@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-30 21:54:40 +02:00
Denys Vlasenko
e6e5f84092 x86/e820: Deinline e820_type_to_string, save 126 bytes
This function compiles to 102 bytes of machine code. It has two
callsites.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Link: http://lkml.kernel.org/r/1443443037-22077-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-30 21:54:40 +02:00
Thomas Gleixner
7ba78053aa x86/process: Unify 32bit and 64bit implementations of get_wchan()
The stack layout and the functionality is identical. Use the 64bit
version for all of x86.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Link: http://lkml.kernel.org/r/20150930083302.779694618@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-30 21:51:34 +02:00
Thomas Gleixner
eddd3826a1 x86/process: Add proper bound checks in 64bit get_wchan()
Dmitry Vyukov reported the following using trinity and the memory
error detector AddressSanitizer
(https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel).

[ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on
address ffff88002e280000
[ 124.576801] ffff88002e280000 is located 131938492886538 bytes to
the left of 28857600-byte region [ffffffff81282e0a, ffffffff82e0830a)
[ 124.578633] Accessed by thread T10915:
[ 124.579295] inlined in describe_heap_address
./arch/x86/mm/asan/report.c:164
[ 124.579295] #0 ffffffff810dd277 in asan_report_error
./arch/x86/mm/asan/report.c:278
[ 124.580137] #1 ffffffff810dc6a0 in asan_check_region
./arch/x86/mm/asan/asan.c:37
[ 124.581050] #2 ffffffff810dd423 in __tsan_read8 ??:0
[ 124.581893] #3 ffffffff8107c093 in get_wchan
./arch/x86/kernel/process_64.c:444

The address checks in the 64bit implementation of get_wchan() are
wrong in several ways:

 - The lower bound of the stack is not the start of the stack
   page. It's the start of the stack page plus sizeof (struct
   thread_info)

 - The upper bound must be:

       top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long).

   The 2 * sizeof(unsigned long) is required because the stack pointer
   points at the frame pointer. The layout on the stack is: ... IP FP
   ... IP FP. So we need to make sure that both IP and FP are in the
   bounds.

Fix the bound checks and get rid of the mix of numeric constants, u64
and unsigned long. Making all unsigned long allows us to use the same
function for 32bit as well.

Use READ_ONCE() when accessing the stack. This does not prevent a
concurrent wakeup of the task and the stack changing, but at least it
avoids TOCTOU.

Also check task state at the end of the loop. Again that does not
prevent concurrent changes, but it avoids walking for nothing.

Add proper comments while at it.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Based-on-patch-from: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-30 21:51:34 +02:00
Thomas Gleixner
09907ca630 Merge branch 'x86/for-kvm' into x86/apic
Pull in the apic change which is provided for kvm folks to pull into
their tree.
2015-09-30 21:20:39 +02:00
Paolo Bonzini
e02ae38713 x86/x2apic: Make stub functions available even if !CONFIG_X86_LOCAL_APIC
Some CONFIG_X86_X2APIC functions, especially x2apic_enabled(), are not
declared if !CONFIG_X86_LOCAL_APIC.  However, the same stubs that work
for !CONFIG_X86_X2APIC are okay even if there is no local APIC support
at all.

Avoid the introduction of #ifdefs by moving the x2apic declarations
completely outside the CONFIG_X86_LOCAL_APIC block.  (Unfortunately,
diff generation messes up the actual change that this patch makes).
There is no semantic change because CONFIG_X86_X2APIC depends on
CONFIG_X86_LOCAL_APIC.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Feng Wu <feng.wu@intel.com>
Link: http://lkml.kernel.org/r/1443435991-35750-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-30 21:17:36 +02:00
Denys Vlasenko
d786ad32c3 x86/apic: Deinline various functions
__x2apic_disable: 178 bytes, 3 calls
__x2apic_enable: 117 bytes, 3 calls
__smp_spurious_interrupt: 110 bytes, 2 calls
__smp_error_interrupt: 208 bytes, 2 calls

Reduces code size by about 850 bytes.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/1443559022-23793-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-30 21:15:53 +02:00
Paul Burton
7a63076d9a MIPS: CPS: #ifdef on CONFIG_MIPS_MT_SMP rather than CONFIG_MIPS_MT
The CONFIG_MIPS_MT symbol can be selected by CONFIG_MIPS_VPE_LOADER in
addition to CONFIG_MIPS_MT_SMP. We only want MT code in the CPS SMP boot
vector if we're using MT for SMP. Thus switch the config symbol we ifdef
against to CONFIG_MIPS_MT_SMP.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: <stable@vger.kernel.org> # 3.16+
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10867/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-30 18:16:02 +02:00
Paul Burton
a5b0f6db0e MIPS: CPS: Don't include MT code in non-MT kernels.
The MT-specific code in mips_cps_boot_vpes can safely be omitted from
kernels which don't support MT, with the default VPE==0 case being used
as it would be after the has_mt (Config3.MT) check failed at runtime.
Discarding the code entirely will save us a few bytes & allow cleaner
handling of MT ASE instructions by later patches.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: <stable@vger.kernel.org> # 3.16+
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10866/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-30 18:15:51 +02:00
Paul Burton
1e5fb282f8 MIPS: CPS: Stop dangling delay slot from has_mt.
The has_mt macro ended with a branch, leaving its callers with a delay
slot that would be executed if Config3.MT is not set. However it would
not be executed if Config3 (or earlier Config registers) don't exist
which makes it somewhat inconsistent at best. Fill the delay slot in the
macro & fix the mips_cps_boot_vpes caller appropriately.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: <stable@vger.kernel.org> # 3.16+
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10865/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-30 18:15:29 +02:00
James Hogan
53960059d5 MIPS: dma-default: Fix 32-bit fall back to GFP_DMA
If there is a DMA zone (usually 24bit = 16MB I believe), but no DMA32
zone, as is the case for some 32-bit kernels, then massage_gfp_flags()
will cause DMA memory allocated for devices with a 32..63-bit
coherent_dma_mask to fall back to using __GFP_DMA, even though there may
only be 32-bits of physical address available anyway.

Correct that case to compare against a mask the size of phys_addr_t
instead of always using a 64-bit mask.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Fixes: a2e715a86c ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.")
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 2.6.36+
Patchwork: https://patchwork.linux-mips.org/patch/9610/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-30 16:27:39 +02:00
Martin Schwidefsky
72d38b1978 s390/vtime: correct scaled cputime of partially idle CPUs
The calculation for the SMT scaling factor for a hardware thread
which has been partially idle needs to disregard the cycles spent
by the other threads of the core while the thread is idle.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-30 16:22:38 +02:00
Ralf Baechle
96fc7a9cee MIPS: Wire up userfaultfd and membarrier syscalls.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-30 14:24:31 +02:00
Andrey Ryabinin
4ac86a6dce x86, efi, kasan: Fix build failure on !KASAN && KMEMCHECK=y kernels
With KMEMCHECK=y, KASAN=n we get this build failure:

  arch/x86/platform/efi/efi.c:673:3:    error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]
  arch/x86/platform/efi/efi_64.c:139:2: error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]
  arch/x86/include/asm/desc.h:121:2:    error: implicit declaration of function ‘memcpy’ [-Werror=implicit-function-declaration]

Don't #undef memcpy if KASAN=n.

Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 769a8089c1 ("x86, efi, kasan: #undef memset/memcpy/memmove per arch")
Link: http://lkml.kernel.org/r/1443544814-20122-1-git-send-email-ryabinin.a.a@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-30 09:29:53 +02:00
Ingo Molnar
ccf79c238f Linux 4.3-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWB9f6AAoJEHm+PkMAQRiGiFMIAJYFLIkF/dXFYMNPGsRGRGYO
 SsQkfYzjy4i/yloyVlGB33e6dqxWdVgCeqYC77TO+1CBq34o6dqM4PACTrhjtS+3
 qQvaP/qn6cSoaGIkdD3v43CCiwMpZZ5+Uj7F7Uz8N4twrpykOZFMM5T7f1lrsG2F
 wJGafmvok9NU2F2wYwaJ8JrzsF6iO6ibFeB8BosRF5Ba4nKqiXVI0xNa0R8PFDm3
 tbh/IkkqokemEqnHyWyszhGFsCQupi+QgsjY/LhWUcCaL7HLEgJmkBX0tXNlgMmK
 TFCq7L8Bigu4nlgZ/iVUB9kh4GTBNVcbdRVN3loJFlczFJlIAa171OVlfRu3lvU=
 =m29x
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rc3' into x86/urgent, before applying dependent fix

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-30 09:29:27 +02:00
Kukjin Kim
4776dbb358 Fixes for Exynos (DT and mach code):
1. Finally fix booting of all 8 cores on Exynos Octa (Exynos542x): all
    8 cores are booting and can be used. The fix, based on vendor
    code and bootloader behavior, is as for time being only
    for MCPM enabled path.
 2. Fix thermal boot issue on SMDK5250.
 3. Fix invalid clock used for FIMD IOMMU.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWCjReAAoJEME3ZuaGi4PXJ4QP/33VbOcpUaO5vONfWlpKkqIy
 qiIXapj/LuJ0GVJTC2If5Ivveiafs1oQXCPOR01ZsobxUtCNFVoaSbp+gBR6bvW0
 SZ8oCF2SF9Rmgi62WzlJOZDrpiBLz+vqKqjp3VFkyHkOdLX0NucclWEWn+5EBOiK
 1fOdaM8Fcqcg98CU1yswnt/D0KJBK8GTS19xMuPYl53WbNsMgeoyQsWpf+z+c2Fi
 JO1GmH0f0wTlP/KSdNpyCn3LaoZzBfvPxkLV/hvpSPfiawVTWopg8wyVL3Hf1Qno
 lrePJlJHKfVQW08NdCar2fmos7moPQcy9/5qXvTJavFmgu6erWQqgLDBFL7t63qI
 jyhqo6CLVo9D7OgfQrWqqToBrANF37EsdSrRKaI8D1ELe9aZf0wlWe6Bk9tPTQHQ
 EhdL2VLI5pz0v8DClKiFoycG+x95UddiLaagP2RC19VZ1OmiMtE9KCIiPYyGPs+e
 Xzols4YRpolnxmSw9hO3P+DcusxRyr6QiPboQexizbiufC2bnNOEHtAv5qeCRo5l
 Uw6Gkzy9p3T6T8quAzAcriT9o+M5Fv5MvaBeSFyZ3C154YA+KJAFVJbfuvP3X1Rc
 N1DgFB4AwWNjSlv34lJ+0T6lHAnqV1C7QtedA999raitOAkIh72zWsRX+F7L7mRO
 NQbaNdjvp6Nc270f2+4D
 =Xcnj
 -----END PGP SIGNATURE-----

Merge tag 'samsung-fixes-4.3' of http://github.com/krzk/linux into v4.3-samsung-fixes

Fixes for Exynos (DT and mach code):
1. Finally fix booting of all 8 cores on Exynos Octa (Exynos542x): all
   8 cores are booting and can be used. The fix, based on vendor
   code and bootloader behavior, is as for time being only
   for MCPM enabled path.
2. Fix thermal boot issue on SMDK5250.
3. Fix invalid clock used for FIMD IOMMU.
2015-09-30 15:42:39 +09:00
Vitaly Kuznetsov
1e034743e9 x86/hyperv: Fix the build in the !CONFIG_KEXEC_CORE case
Recent changes in the Hyper-V driver:

  b4370df2b1 ("Drivers: hv: vmbus: add special crash handler")

broke the build when CONFIG_KEXEC_CORE is not set:

  arch/x86/built-in.o: In function `hv_machine_crash_shutdown':
  arch/x86/kernel/cpu/mshyperv.c:112: undefined reference to `native_machine_crash_shutdown'

Decorate all kexec related code with #ifdef CONFIG_KEXEC_CORE.

Reported-by: Jim Davis <jim.epost@gmail.com>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: devel@linuxdriverproject.org
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1443002577-25370-1-git-send-email-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-30 07:44:15 +02:00
Fabio Estevam
178b2d09af ARM: dts: imx7d: Fix UART2 base address
The UART2 memory space starts at address 0x30890000 (UART2_URXD).

Fix it so that UART2 can be used.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Fixes: 9496734502 ("ARM: dts: add imx7d soc dtsi file")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2015-09-30 09:48:21 +08:00
Michael Ellerman
4fa9a3f6b6 powerpc/ps3: Remove unused os_area_db_id_video_mode
This struct is unused, which is now a build error with gcc 6:

  error: 'os_area_db_id_video_mode' defined but not used

There doesn't seem to be any good reason to keep it around so remove it,
it's in the history if anyone needs it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-30 10:33:02 +10:00
Christian Borntraeger
adc0b7fbf6 s390/boot/decompression: disable floating point in decompressor
my gcc 5.1 used an ldgr instruction with a register != 0,2,4,6 for
spilling/filling into a floating point register in our decompressor.

This will cause an AFP-register data exception as the decompressor
did not setup the additional floating point registers via cr0.
That causes a program check loop that looked like a hang with
one "Uncompressing Linux... " message (directly booted via kvm)
or a loop of "Uncompressing Linux... " messages (when booted via
zipl boot loader).

The offending code in my build was

   48e400:       e3 c0 af ff ff 71       lay     %r12,-1(%r10)
-->48e406:       b3 c1 00 1c             ldgr    %f1,%r12
   48e40a:       ec 6c 01 22 02 7f       clij    %r6,2,12,0x48e64e

but gcc could do spilling into an fpr at any function. We can
simply disable floating point support at that early stage.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
2015-09-29 14:45:10 +02:00
Michael Ellerman
6b98f2bec5 powerpc/configs: Re-enable CONFIG_SCSI_DH
Commit 086b91d052 ("scsi_dh: integrate into the core SCSI code")
changed CONFIG_SCSI_DH from tristate to bool.

Our defconfigs have CONFIG_SCSI_DH=m, which the kconfig machinery warns
us is invalid, but instead of converting it to =y it leaves it unset.
This means we loose the CONFIG_SCSI_DH code and everything that depends
on it.

So convert the values in the defconfigs to =y.

Fixes: 086b91d052 ("scsi_dh: integrate into the core SCSI code")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-29 20:34:03 +10:00
Robert Richter
9410097074 irqchip/gicv3-its: Workaround for Cavium ThunderX errata 22375, 24313
This implements two gicv3-its errata workarounds for ThunderX. Both
with small impact affecting only ITS table allocation.

 erratum 22375: only alloc 8MB table size
 erratum 24313: ignore memory access type

The fixes are in ITS initialization and basically ignore memory access
type and table size provided by the TYPER and BASER registers.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Signed-off-by: Robert Richter <rrichter@cavium.com>
Reviewed-by: Marc Zygnier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1442869119-1814-6-git-send-email-rric@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-29 10:10:54 +02:00
Robert Richter
6d4e11c5e2 irqchip/gicv3: Workaround for Cavium ThunderX erratum 23154
This patch implements Cavium ThunderX erratum 23154.

The gicv3 of ThunderX requires a modified version for reading the IAR
status to ensure data synchronization. Since this is in the fast-path
and called with each interrupt, runtime patching is used using jump
label patching for smallest overhead (no-op). This is the same
technique as used for tracepoints.

Signed-off-by: Robert Richter <rrichter@cavium.com>
Reviewed-by: Marc Zygnier <marc.zyngier@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1442869119-1814-3-git-send-email-rric@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-29 10:10:53 +02:00
Joonyoung Shim
c7d2ecd9f6 ARM: dts: Fix wrong clock binding for sysmmu_fimd1_1 on exynos5420
The sysmmu_fimd1_1 should bind the clock CLK_SMMU_FIMD1M1, not the clock
CLK_SMMU_FIMD1M0. CLK_SMMU_FIMD1M0 is a clock for the sysmmu_fimd1_0.

This wrong clock binding causes the problem that is blocked in iommu_map
function when IOMMU is enabled and exynos-drm driver tries to allocate
buffer via DMA mapping API on Odroid-XU3 board.

Fixes: b700451678 ("ARM: dts: add sysmmu nodes for exynos5420")
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Cc: <stable@vger.kernel.org> # v4.2
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
2015-09-29 15:39:58 +09:00
Yadwinder Singh Brar
f404e7a730 ARM: dts: Fix bootup thermal issue on smdk5250
With default config on smdk5250 latest tree throws below message:

[    2.226049] thermal thermal_zone0: critical temperature reached(224 C),shutting down
[    2.227840] reboot: Failed to start orderly shutdown: forcing the issue

and hangs randomly because it reads wrong temperature value.

I can't figure out any direct relation between LDO10 and TMU from board
schematics which I have. So making LDO10 always-on to fix issue for now.

Signed-off-by: Yadwinder Singh Brar <yadi.brar01@gmail.com>
[pankaj.dubey: resubmitted after rebasing to latest kgene tree]
Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Tested-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
2015-09-29 15:33:29 +09:00
Geert Uytterhoeven
56e86dd4bb ARM: shmobile: r8a7791 dtsi: Add CPG/MSTP Clock Domain for sound
797a0626e0 ("ARM: shmobile: r8a7791 dtsi: Add CPG/MSTP Clock Domain")
added CPG/MSTP clock-cells domain support, but it was missing sound
support. This patch adds it.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
[horms: updated commit id referred to in changelog]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2015-09-29 09:44:13 +09:00
Geert Uytterhoeven
6507c4efd2 ARM: shmobile: r8a7790 dtsi: Add CPG/MSTP Clock Domain for sound
484adb0058 ("ARM: shmobile: r8a7790 dtsi: Add CPG/MSTP Clock Domain")
added CPG/MSTP clock-cells domain support, but it was missing sound
support. This patch adds it.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
[horms: Updated commit id referred to in changelog]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
2015-09-29 09:44:13 +09:00
Olof Johansson
cb311494bb The i.MX fixes for 4.3:
- One fix for i.MX6 Rex Pro board to remove a duplicated pinctrl entry
    which causes error for USB pinctrl setup.
  - Fix MC34708 PMIC interrupt level for imx53-qsrb board.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWCM4UAAoJEFBXWFqHsHzOfEIIAJUXni9UzPgZ77RvX8tn8jCU
 E3GBEDXlTHmUaTMiCSIZaB00X/rEFGO/GMd1zUFxGZW3yTX4FJ/pCn3dNbFqmYLY
 T2v91D2r7Fbyugm8hGnbHoPlmk4RkyJ0oSIBGaR2PXnbrKYJHYgDBxUjsKH012LS
 2+ea3EF5bJ3eCtAI50GQwZXJc1qAJ9XoMSyhIoJXVFtwJ65JPdM1ntn/yBYEZD01
 uBnVwPw7EQzuPYwzdmj38BWPxJUxdDSj6SvZGtK2MVWl8lUvcHRiNcEX4lFMBMVi
 6vs7xQ7T9YnAh4jVjsAHWb9qjFtomLkuDhkClvUJHSHPU+Ue++iYSofwU6GnM5Y=
 =tm7i
 -----END PGP SIGNATURE-----

Merge tag 'imx-fixes-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes

The i.MX fixes for 4.3:
 - One fix for i.MX6 Rex Pro board to remove a duplicated pinctrl entry
   which causes error for USB pinctrl setup.
 - Fix MC34708 PMIC interrupt level for imx53-qsrb board.

* tag 'imx-fixes-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: fix usb pin control for imx-rex dts
  ARM: imx53: qsrb: fix PMIC interrupt level
  ARM: imx53: include IRQ dt-bindings header

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-09-28 16:30:19 -07:00
Malcolm Crossley
64c98e7f49 x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map
Sanitizing the e820 map may produce extra E820 entries which would result in
the topmost E820 entries being removed. The removed entries would typically
include the top E820 usable RAM region and thus result in the domain having
signicantly less RAM available to it.

Fix by allowing sanitize_e820_map to use the full size of the allocated E820
array.

Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-28 18:03:38 +01:00
Linus Torvalds
3225031fbe Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull arch/tile bugfix from Chris Metcalf:
 "This fixes a bug in 'make allmodconfig'"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: fix build failure
2015-09-28 12:27:18 -04:00
Sudip Mukherjee
3a48d13d76 tile: fix build failure
When building with allmodconfig the build was failing with the error:

arch/tile/kernel/usb.c:70:1: warning: data definition has no type or storage class [enabled by default]
arch/tile/kernel/usb.c:70:1: error: type defaults to 'int' in declaration of 'arch_initcall' [-Werror=implicit-int]
arch/tile/kernel/usb.c:70:1: warning: parameter names (without types) in function declaration [enabled by default]
arch/tile/kernel/usb.c:63:19: warning: 'tilegx_usb_init' defined but not used [-Wunused-function]

Include linux/module.h to resolve the build failure.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
2015-09-28 11:23:39 -04:00
Vitaly Kuznetsov
0b34a166f2 x86/xen: Support kexec/kdump in HVM guests by doing a soft reset
Currently there is a number of issues preventing PVHVM Xen guests from
doing successful kexec/kdump:

  - Bound event channels.
  - Registered vcpu_info.
  - PIRQ/emuirq mappings.
  - shared_info frame after XENMAPSPACE_shared_info operation.
  - Active grant mappings.

Basically, newly booted kernel stumbles upon already set up Xen
interfaces and there is no way to reestablish them. In Xen-4.7 a new
feature called 'soft reset' is coming. A guest performing kexec/kdump
operation is supposed to call SCHEDOP_shutdown hypercall with
SHUTDOWN_soft_reset reason before jumping to new kernel. Hypervisor
(with some help from toolstack) will do full domain cleanup (but
keeping its memory and vCPU contexts intact) returning the guest to
the state it had when it was first booted and thus allowing it to
start over.

Doing SHUTDOWN_soft_reset on Xen hypervisors which don't support it is
probably OK as by default all unknown shutdown reasons cause domain
destroy with a message in toolstack log: 'Unknown shutdown reason code
5. Destroying domain.'  which gives a clue to what the problem is and
eliminates false expectations.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-28 14:48:52 +01:00
Boris Ostrovsky
2ecf91b6d8 xen/x86: Don't try to write syscall-related MSRs for PV guests
For PV guests these registers are set up by hypervisor and thus
should not be written by the guest. The comment in xen_write_msr_safe()
says so but we still write the MSRs, causing the hypervisor to
print a warning.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-28 14:48:52 +01:00
Juergen Gross
24f775a660 xen: use correct type for HYPERVISOR_memory_op()
HYPERVISOR_memory_op() is defined to return an "int" value. This is
wrong, as the Xen hypervisor will return "long".

The sub-function XENMEM_maximum_reservation returns the maximum
number of pages for the current domain. An int will overflow for a
domain configured with 8TB of memory or more.

Correct this by using the correct type.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-09-28 14:48:52 +01:00
Russell King
868e87ccda ARM: make RiscPC depend on MMU
RiscPC fails to build if MMU is disabled:

arch/arm/mach-rpc/ecard.c: In function 'ecard_init_pgtables':
arch/arm/mach-rpc/ecard.c:229:2: error: implicit declaration of function 'pgd_offset' [-Werror=implicit-function-declaration]

arrange for RiscPC to depend on MMU.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-28 12:34:15 +01:00
Radim Krčmář
9bac175d8e Revert "KVM: x86: zero kvmclock_offset when vcpu0 initializes kvmclock system MSR"
Shifting pvclock_vcpu_time_info.system_time on write to KVM system time
MSR is a change of ABI.  Probably only 2.6.16 based SLES 10 breaks due
to its custom enhancements to kvmclock, but KVM never declared the MSR
only for one-shot initialization.  (Doc says that only one write is
needed.)

This reverts commit b7e60c5aed.
And adds a note to the definition of PVCLOCK_COUNTS_FROM_ZERO.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-28 13:06:37 +02:00
Ashok Raj
6e06780a98 x86/mce: Don't clear shared banks on Intel when offlining CPUs
It is not safe to clear global MCi_CTL banks during CPU offline
or suspend/resume operations. These MSRs are either
thread-scoped (meaning private to a thread), or core-scoped
(private to threads in that core only), or with a socket scope:
visible and controllable from all threads in the socket.

When we offline a single CPU, clearing those MCi_CTL bits will
stop signaling for all the shared, i.e., socket-wide resources,
such as LLC, iMC, etc.

In addition, it might be possible to compromise the integrity of
an Intel Secure Guard eXtentions (SGX) system if the attacker
has control of the host system and is able to inject errors
which would be otherwise ignored when MCi_CTL bits are cleared.

Hence on SGX enabled systems, if MCi_CTL is cleared, SGX gets
disabled.

Tested-by: Serge Ayoun <serge.ayoun@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
[ Cleanup text. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1441391390-16985-1-git-send-email-ashok.raj@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-28 10:15:26 +02:00
Denys Vlasenko
0d44975d1e x86/kgdb: Replace bool_int_array[NR_CPUS] with bitmap
Straigntforward conversion from:

    int was_in_debug_nmi[NR_CPUS]

to:

    DECLARE_BITMAP(was_in_debug_nmi, NR_CPUS)

Saves about 2 kbytes in BSS for NR_CPUS=512.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443271638-2568-1-git-send-email-dvlasenk@redhat.com
[ Tidied up the code a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-28 10:13:31 +02:00
Geert Uytterhoeven
95bc06ef04 m68k/defconfig: Update defconfigs for v4.3-rc1
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2015-09-28 10:01:09 +02:00
Andreas Schwab
8474ba7419 m68k: Define asmlinkage_protect
Make sure the compiler does not modify arguments of syscall functions.
This can happen if the compiler generates a tailcall to another
function.  For example, without asmlinkage_protect sys_openat is compiled
into this function:

sys_openat:
	clr.l %d0
	move.w 18(%sp),%d0
	move.l %d0,16(%sp)
	jbra do_sys_open

Note how the fourth argument is modified in place, modifying the register
%d4 that gets restored from this stack slot when the function returns to
user-space.  The caller may expect the register to be unmodified across
system calls.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
2015-09-28 09:59:45 +02:00
Geert Uytterhoeven
7f843dab13 m68k: Wire up membarrier
$ ./membarrier_test
membarrier MEMBARRIER_CMD_QUERY syscall available.
membarrier: MEMBARRIER_CMD_SHARED success.
membarrier: tests done!
$

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2015-09-28 09:59:44 +02:00
Geert Uytterhoeven
b92858f2be m68k: Wire up userfaultfd
$ ./userfaultfd 10 99
nr_pages: 2560, nr_pages_per_cpu: 2560
bounces: 98, mode: racing, userfaults: 1121
bounces: 97, mode: rnd, userfaults: 977
bounces: 96, mode:, userfaults: 1119
bounces: 95, mode: rnd racing ver poll, userfaults: 1040
bounces: 94, mode: racing ver poll, userfaults: 1022
bounces: 93, mode: rnd ver poll, userfaults: 946
bounces: 92, mode: ver poll, userfaults: 1115
bounces: 91, mode: rnd racing poll, userfaults: 977
bounces: 90, mode: racing poll, userfaults: 899
bounces: 89, mode: rnd poll, userfaults: 881
bounces: 88, mode: poll, userfaults: 1069
bounces: 87, mode: rnd racing ver, userfaults: 1114
bounces: 86, mode: racing ver, userfaults: 1109
bounces: 85, mode: rnd ver, userfaults: 1165
bounces: 84, mode: ver, userfaults: 1107
bounces: 83, mode: rnd racing, userfaults: 1134
bounces: 82, mode: racing, userfaults: 1105
bounces: 81, mode: rnd, userfaults: 1323
bounces: 80, mode:, userfaults: 1103
bounces: 79, mode: rnd racing ver poll, userfaults: 909
bounces: 78, mode: racing ver poll, userfaults: 1095
bounces: 77, mode: rnd ver poll, userfaults: 951
bounces: 76, mode: ver poll, userfaults: 1099
bounces: 75, mode: rnd racing poll, userfaults: 1035
bounces: 74, mode: racing poll, userfaults: 1097
bounces: 73, mode: rnd poll, userfaults: 1159
bounces: 72, mode: poll, userfaults: 1042
bounces: 71, mode: rnd racing ver, userfaults: 848
bounces: 70, mode: racing ver, userfaults: 1093
bounces: 69, mode: rnd ver, userfaults: 892
bounces: 68, mode: ver, userfaults: 1091
bounces: 67, mode: rnd racing, userfaults: 1219
bounces: 66, mode: racing, userfaults: 1089
bounces: 65, mode: rnd, userfaults: 988
bounces: 64, mode:, userfaults: 1087
bounces: 63, mode: rnd racing ver poll, userfaults: 882
bounces: 62, mode: racing ver poll, userfaults: 984
bounces: 61, mode: rnd ver poll, userfaults: 701
bounces: 60, mode: ver poll, userfaults: 1071
bounces: 59, mode: rnd racing poll, userfaults: 1137
bounces: 58, mode: racing poll, userfaults: 1032
bounces: 57, mode: rnd poll, userfaults: 911
bounces: 56, mode: poll, userfaults: 1079
bounces: 55, mode: rnd racing ver, userfaults: 1106
bounces: 54, mode: racing ver, userfaults: 1077
bounces: 53, mode: rnd ver, userfaults: 886
bounces: 52, mode: ver, userfaults: 1075
bounces: 51, mode: rnd racing, userfaults: 1101
bounces: 50, mode: racing, userfaults: 1073
bounces: 49, mode: rnd, userfaults: 1070
bounces: 48, mode:, userfaults: 1071
bounces: 47, mode: rnd racing ver poll, userfaults: 1077
bounces: 46, mode: racing ver poll, userfaults: 910
bounces: 45, mode: rnd ver poll, userfaults: 1063
bounces: 44, mode: ver poll, userfaults: 1028
bounces: 43, mode: rnd racing poll, userfaults: 1043
bounces: 42, mode: racing poll, userfaults: 1065
bounces: 41, mode: rnd poll, userfaults: 912
bounces: 40, mode: poll, userfaults: 1063
bounces: 39, mode: rnd racing ver, userfaults: 880
bounces: 38, mode: racing ver, userfaults: 1061
bounces: 37, mode: rnd ver, userfaults: 1144
bounces: 36, mode: ver, userfaults: 1059
bounces: 35, mode: rnd racing, userfaults: 967
bounces: 34, mode: racing, userfaults: 1057
bounces: 33, mode: rnd, userfaults: 1076
bounces: 32, mode:, userfaults: 1055
bounces: 31, mode: rnd racing ver poll, userfaults: 997
bounces: 30, mode: racing ver poll, userfaults: 1053
bounces: 29, mode: rnd ver poll, userfaults: 968
bounces: 28, mode: ver poll, userfaults: 978
bounces: 27, mode: rnd racing poll, userfaults: 1008
bounces: 26, mode: racing poll, userfaults: 1049
bounces: 25, mode: rnd poll, userfaults: 900
bounces: 24, mode: poll, userfaults: 1047
bounces: 23, mode: rnd racing ver, userfaults: 988
bounces: 22, mode: racing ver, userfaults: 1045
bounces: 21, mode: rnd ver, userfaults: 1027
bounces: 20, mode: ver, userfaults: 1043
bounces: 19, mode: rnd racing, userfaults: 1017
bounces: 18, mode: racing, userfaults: 1041
bounces: 17, mode: rnd, userfaults: 979
bounces: 16, mode:, userfaults: 1039
bounces: 15, mode: rnd racing ver poll, userfaults: 1134
bounces: 14, mode: racing ver poll, userfaults: 1037
bounces: 13, mode: rnd ver poll, userfaults: 1046
bounces: 12, mode: ver poll, userfaults: 1035
bounces: 11, mode: rnd racing poll, userfaults: 1060
bounces: 10, mode: racing poll, userfaults: 1033
bounces: 9, mode: rnd poll, userfaults: 1003
bounces: 8, mode: poll, userfaults: 929
bounces: 7, mode: rnd racing ver, userfaults: 964
bounces: 6, mode: racing ver, userfaults: 1029
bounces: 5, mode: rnd ver, userfaults: 1053
bounces: 4, mode: ver, userfaults: 1027
bounces: 3, mode: rnd racing, userfaults: 863
bounces: 2, mode: racing, userfaults: 1025
bounces: 1, mode: rnd, userfaults: 1043
bounces: 0, mode:, userfaults: 950

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2015-09-28 09:59:44 +02:00
Geert Uytterhoeven
5b3f33eb40 m68k: Wire up direct socket calls
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2015-09-28 09:59:43 +02:00
Geliang Tang
18ab2cd3ee perf/core, perf/x86: Change needlessly global functions and a variable to static
Fixes various sparse warnings.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/70c14234da1bed6e3e67b9c419e2d5e376ab4f32.1443367286.git.geliangtang@163.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-28 08:09:52 +02:00
Ingo Molnar
6afc0c269c Merge branch 'linus' into perf/core, to pick up fixes before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-28 08:06:57 +02:00
Linus Torvalds
097f70b3c4 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 - Properly setup irq handling for ATH79 platforms
 - Fix bootmem mapstart calculation for contiguous maps
 - Handle little endian and older CPUs correct in BPF
 - Fix console for Fulong 2E systems
 - Handle FTLB correctly on R6 CPUs
 - Fixes for CM, GIC and MAAR support code

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Initialise MAARs on secondary CPUs
  MIPS: print MAAR configuration during boot
  MIPS: mm: compile maar_init unconditionally
  irqchip: mips-gic: Fix pending & mask reads for MIPS64 with 32b GIC.
  irqchip: mips-gic: Convert CPU numbers to VP IDs.
  MIPS: CM: Provide a function to map from CPU to VP ID.
  MIPS: Fix FTLB detection for R6
  MIPS: cpu-features: Add cpu_has_ftlb
  MIPS: ATH79: Add irq chip ar7240-misc-intc
  MIPS: ATH79: Set missing irq ack handler for ar7100-misc-intc irq chip
  MIPS: BPF: Fix build on pre-R2 little endian CPUs
  MIPS: BPF: Avoid unreachable code on little endian
  MIPS: bootmem: Fix mapstart calculation for contiguous maps
  MIPS: Fix console output for Fulong2e system
2015-09-27 18:22:34 -04:00
Linus Torvalds
e3be4266d3 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "Another pile of fixes for perf:

   - Plug overflows and races in the core code

   - Sanitize the flow of the perf syscall so we error out before
     handling the more complex and hard to undo setups

   - Improve and fix Broadwell and Skylake hardware support

   - Revert a fix which broke what it tried to fix in perf tools

   - A couple of smaller fixes in various places of perf tools"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Fix copying of /proc/kcore
  perf intel-pt: Remove no_force_psb from documentation
  perf probe: Use existing routine to look for a kernel module by dso->short_name
  perf/x86: Change test_aperfmperf() and test_intel() to static
  tools lib traceevent: Fix string handling in heterogeneous arch environments
  perf record: Avoid infinite loop at buildid processing with no samples
  perf: Fix races in computing the header sizes
  perf: Fix u16 overflows
  perf: Restructure perf syscall point of no return
  perf/x86/intel: Fix Skylake FRONTEND MSR extrareg mask
  perf/x86/intel/pebs: Add PEBS frontend profiling for Skylake
  perf/x86/intel: Make the CYCLE_ACTIVITY.* constraint on Broadwell more specific
  perf tools: Bool functions shouldn't return -1
  tools build: Add test for presence of __get_cpuid() gcc builtin
  tools build: Add test for presence of numa_num_possible_cpus() in libnuma
  Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum"
  perf stat: Fix per-pkg event reporting bug
2015-09-27 12:51:39 -04:00
Paul Burton
e060f6ed28 MIPS: Initialise MAARs on secondary CPUs
MAARs should be initialised on each CPU (or rather, core) in the system
in order to achieve consistent behaviour & performance. Previously they
have only been initialised on the boot CPU which leads to performance
problems if tasks are later scheduled on a secondary CPU, particularly
if those tasks make use of unaligned vector accesses where some CPUs
don't handle any cases in hardware for non-speculative memory regions.
Fix this by recording the MAAR configuration from the boot CPU and
applying it to secondary CPUs as part of their bringup.

Reported-by: Doug Gilmore <doug.gilmore@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Andrew Bresticker <abrestic@chromium.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11239/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-27 14:15:26 +02:00
Paul Burton
651ca7f4da MIPS: print MAAR configuration during boot
Verifying that the MAAR configuration is as expected is useful when
debugging the performance of a system. Print out the memory regions
configured via MAAR along with their attributes.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11238/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-27 14:14:47 +02:00
Paul Burton
def3ab5d0a MIPS: mm: compile maar_init unconditionally
maar_init was previously only compiled when CONFIG_NEED_MULTIPLE_NODES
was not set, which has been fine since it is only called from the
standard implementation of mem_init which has the same condition. In
preparation for calling it from the SMP startup code on secondary CPUs,
move maar_init outside of the #ifndef such that it is always compiled.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/11237/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-27 14:14:00 +02:00
Paul Burton
7573b94e08 MIPS: CM: Provide a function to map from CPU to VP ID.
The VP ID of a given CPU may not match up with the CPU number used by
Linux. For example, if the width of the VP part of the VP ID is wider
than log2(number of VPs per core) and the system has multiple cores then
this will be the case. Alternatively, if a pre-r6 system implements the
MT ASE with multiple VPEs per core and Linux is built without support
for the MT ASE then the numbers won't match up either. Provide a
function to convert from CPU number to VP ID.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Patchwork: https://patchwork.linux-mips.org/patch/11211/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-09-27 14:11:17 +02:00
Linus Torvalds
162e6df47c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "Two bugfixes from Andy addressing at least some of the subtle NMI
  related wreckage which has been reported by Sasha Levin"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI code
  x86/paravirt: Replace the paravirt nop with a bona fide empty function
2015-09-27 06:51:42 -04:00
Linus Torvalds
c905929ac9 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Just two fixes: wire up the new system calls added during the last
  merge window, and fix another user access site"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: alignment: fix alignment handling for uaccess changes
  ARM: wire up new syscalls
2015-09-27 06:48:48 -04:00
Linus Torvalds
685b5f1de6 ARM: SoC fixes for v4.3-rc
Our first real batch of fixes this release cycle. There's a collection of
 them here:
 
 - A fixup for a build breakage that hits on arm64 allmodconfig in QCOM SCM
   firmware drivers
 - MMC fixes for OMAP that had quite a bit of breakage this merge window.
 - Misc build/warning fixes on PXA and OMAP
 - A couple of minor fixes for Beagleboard X15 which is now starting to see
   a few more users in the wild
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWB3+pAAoJEIwa5zzehBx36sAP/idEfXiZJiriXSLcU1AkWo23
 PAeIiIpENn6tbYrI0ogN+I63uNE9TkrWP/us9ZlPqQJOG29DVyol0YuAEmT3VSeo
 hColJInE2450fPxFw7hKOWnQ2En30fI5cIHUHzDNJl1Tn2liE4K2FuenSLmf34KH
 XQ2VkNMjj8uM9C0UMy/Tescm3r4LYKk9NXVG+oWDkw1PVdFMsBIE1Vo7KLWGJ6Ta
 Ig6Ub2A2ag1usJjjaTNJsbU4WRxHk37/r+psDzyTTxhp9ulS0uer84K7pqW7AVWn
 NsTUI83z3grKvnQrlTNKu7WCJH4Q+Xgru05mV3yhEoza+X7RhMAWX7zAgq7D3fDX
 mRT4L5RLZJZ8GDsWS35BMxBOOi3uxfPUtT1k2YobJeQYKaHaE06S5K9BMbaPCV6M
 d7ShNGuESz9RRLRDQFYaCqhZHyYoWxS1o0gczoEqB9/piPS7Fv7rQ1tujEU+/o8r
 8uwN6zYmcUJJykn+NxPP6Qskc4vWT+nQaGOp7YKkUFrh6wgssIOlU1HoYfPssjbM
 A0LHFZ1vRNFxtdnPhSi9A5IvVg4ST2G47MSV46ifplWzGyJXbDxuBV1/sCxbfUFn
 FRXt5FakkkbhTm/PsC9Dd/CYBx1HMlGoAP6nvmMccmmvWKTfEzjkuyTDhPWGihZN
 ZSos4D3kdlKnEtRZYfrx
 =ThcO
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Our first real batch of fixes this release cycle.  Nothing really
  concerning, and diffstat is a bit inflated due to some DT contents
  moving around on STi platforms.

  There's a collection of them here:

   - A fixup for a build breakage that hits on arm64 allmodconfig in
     QCOM SCM firmware drivers
   - MMC fixes for OMAP that had quite a bit of breakage this merge
     window.
   - Misc build/warning fixes on PXA and OMAP
   - A couple of minor fixes for Beagleboard X15 which is now starting
     to see a few more users in the wild"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (31 commits)
  ARM: sti: dt: adapt DT to fix probe/bind issues in DRM driver
  ARM: dts: fix omap2+ address translation for pbias
  firmware: qcom: scm: Add function stubs for ARM64
  ARM: dts: am57xx-beagle-x15: use palmas-usb for USB2
  ARM: omap2plus_defconfig: enable GPIO_PCA953X
  ARM: dts: omap5-uevm.dts: fix i2c5 pinctrl offsets
  ARM: OMAP2+: AM43XX: Enable autoidle for clks in am43xx_init_late
  ARM: dts: am57xx-beagle-x15: Update Phy supplies
  ARM: pxa: balloon3: Fix build error
  ARM: dts: Fixup model name for HP t410 dts
  ARM: dts: DRA7: fix a typo in ethernet
  ARM: omap2plus_defconfig: make PCF857x built-in
  ARM: dts: Use ti,pbias compatible string for pbias
  ARM: OMAP5: Cleanup options for SoC only build
  ARM: DRA7: Select missing options for SoC only build
  ARM: OMAP2+: board-generic: Remove stale of_irq macros
  ARM: OMAP4+: PM: erratum is used by OMAP5 and DRA7 as well
  ARM: dts: omap3-igep: Move eth IRQ pinmux to IGEPv2 common dtsi
  ARM: dts: am57xx-beagle-x15: Add wakeup irq for mcp79410
  ARM: dts: am335x-phycore-som: Fix mpu voltage
  ...
2015-09-27 06:45:18 -04:00
Olof Johansson
e46fc90ec2 ARM: pxa: fixes for v4.3
These fixes are mainly regression fixes triggered by irq changes,
 common clock framework introduction and sound side-effect of
 other platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV/nW0AAoJEAP2et0duMsSAP0QAKqFrk4GyRNTFxDQNBj43jrS
 3dHtbbTp04sf/RotcGbPPfGvWkv4LAc4CGYnMC3lXZw64HdIHtuHiiwrj3eeuoH9
 zw0xJpbzcHyu25i9wrOY06WgJ4TfTwqfBhhZZ9xoYs8XjuhbKxOP/X7ytSQ56aK9
 iyYCiE4sB4Ikgiu9iOvoztqo3SgtLuK7HYIgV5XASbVqtco4pLfq0upTbfkOesqw
 V9/+rnG5IaHAxgxuCT/UFg4uP1mW0W8EgC4UGi+Vi2MPQ5kmUwSSTmCcIYGryBem
 etJth5b8Q0AEnlEnx5plAhCPskpU7YkPCticQAmQkdU3fJC09+rzkgV66dOi64Mz
 f6YfGH/UXaHZjYZqIDRStN+Bp7HKJRBXNTtRUFfRFY40mKQFrcyrxPmYWIAiwY+E
 62O2/qzMMuycxlQo7xhH00qEK9oB2VfshhqXKRHWDhKyAjPbvhF6h0tnHBJSsPzi
 HJRtzs5dHtCqYBjVY21U391D7IA5lEck4cOExvVRbe1dn2eGbOZcLTS+caZbzhEy
 ulYbS4QZvtFTh7+dRhpG0jTuxIjMbZskogvOSET/wNuZewu7IrFMZt8ySPFbHwDs
 h5i+TZVfxcXu8kv+C5rT5q0bAP18p8dzowgzODh5iQpwTLahZ7tUPJmF+sC3QuSd
 zAOeAqKy3RQ8NL74LMPo
 =fNDY
 -----END PGP SIGNATURE-----

Merge tag 'pxa-fixes-v4.3' of https://github.com/rjarzmik/linux into fixes

ARM: pxa: fixes for v4.3

These fixes are mainly regression fixes triggered by irq changes,
common clock framework introduction and sound side-effect of
other platforms.

* tag 'pxa-fixes-v4.3' of https://github.com/rjarzmik/linux:
  ARM: pxa: balloon3: Fix build error
  ARM: pxa: ssp: Fix build error by removing originally incorrect DT binding
  ARM: pxa: fix DFI bus lockups on startup

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-09-26 22:23:26 -07:00
Olof Johansson
b8ba826f8d Fixes for omaps for v4.3-rc cycle:
- Two more patches to fix most of the MMC regressions with the
   PBIAS regulator changes. At least two MMC driver related issues
   still seems to remain for omap3 legacy booting and omap4 duovero.
   Note that the dts changes depend on a recent regulator fix, and
   are based on the regulator commit now in mainline kernel
 
 - Enable autoidle for am43xx clocks to prevent clocks from staying
   always on
 
 - Fix i2c5 pinctrl offsets for omap5-uevm
 
 - Enable PCA953X as that's needed for HDMI to work on omap5
 
 - Update phy supplies for beagle x15 beta board
 
 - Use palmas-usb for on beagle x15 to start using the related
   driver that recently got merged
 -
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBKnGAAoJEBvUPslcq6Vzps4QAIJGO+P3Fivcaso+1Spw41ib
 tTQUgF5dGX1It7HageKyd8mk9d5ywlZ3iFCJKuxdJAZfzSzdW6QdF/XKvH5K26nO
 +imDQV/WdGYcgKGQWvkMGbaWIcqxkSU1Wk/VWIlibXd875vVE5Koit5yvtQZTt19
 Xxk7Sj3ADCQ+WTgh+k/hFlvypjbKnWfV2HvCZ+tPPDHQA0IVejAfWn4u3jmcx6S9
 VY1CnU5NzQNNLmXaYwuUSKYFnipfrljsNvZfwLvVFYOfApWOAKpWzs2GdX161TJx
 oZdMhSVwJ9gBGRGhv33GXNouVmq4aEesXwg8M0fd8WpWWXVDI8SkEgALU9eghgGx
 Z41OuJIXS9udgqQdfwK2EUrynKKhQ/R1ywAM4SHyGCbs+FO3yAE8Gxgot/cnaGGx
 xJkog31VkHfG2ucGjBMaMJQ9oJfCtyT9tTdpHOnaKVrnn/7ZQwDbRR/npsTaA1Vg
 6EZvcYJpkmQ8Z/wYdMVvHrbzZ5GY/FRsRo8MwADmarLGRJ8aN+T3ZXik/cFmEqu+
 UN2OkOWn0i/OTdw8ED7pzvTMFr/mOZW1G51cj9W/uuVrA5nEmIWxrlVYm90aiE65
 Qz+f+qcTiVFFatpEq9CUhxsvOiDZ9gbBW130f/hmrkKm3PziAla3V1n+nOcO6/IF
 UBINOYy30yfDDccmbpnS
 =UnKL
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v4.3/fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

Fixes for omaps for v4.3-rc cycle:

- Two more patches to fix most of the MMC regressions with the
  PBIAS regulator changes. At least two MMC driver related issues
  still seems to remain for omap3 legacy booting and omap4 duovero.
  Note that the dts changes depend on a recent regulator fix, and
  are based on the regulator commit now in mainline kernel

- Enable autoidle for am43xx clocks to prevent clocks from staying
  always on

- Fix i2c5 pinctrl offsets for omap5-uevm

- Enable PCA953X as that's needed for HDMI to work on omap5

- Update phy supplies for beagle x15 beta board

- Use palmas-usb for on beagle x15 to start using the related
  driver that recently got merged

* tag 'omap-for-v4.3/fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: fix omap2+ address translation for pbias
  ARM: dts: am57xx-beagle-x15: use palmas-usb for USB2
  ARM: omap2plus_defconfig: enable GPIO_PCA953X
  ARM: dts: omap5-uevm.dts: fix i2c5 pinctrl offsets
  ARM: OMAP2+: AM43XX: Enable autoidle for clks in am43xx_init_late
  ARM: dts: am57xx-beagle-x15: Update Phy supplies
  regulator: pbias: program pbias register offset in pbias driver
  ARM: omap2plus_defconfig: Enable MUSB DMA support
  ARM: DRA752: Add ID detect for ES2.0
  ARM: OMAP3: vc: fix 'or' always true warning
  ARM: OMAP2+: Fix booting if no timer parent clock is available
  ARM: OMAP2+: omap-device: fix race deferred probe of omap_hsmmc vs omap_device_late_init

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-09-26 22:22:31 -07:00
Linus Torvalds
966966a630 PCI updates for v4.3:
Resource management
     - Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
     - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas)
 
   MSI
     - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)
 
   Renesas R-Car host bridge driver
     - Add R8A7794 support (Sergei Shtylyov)
 
   Miscellaneous
     - Fix devfn for VPD access through function 0 (Alex Williamson)
     - Use function 0 VPD only for identical functions (Alex Williamson)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBUq2AAoJEFmIoMA60/r8wbwP/0D/+fKEPYJlB6hx1wLHpVk3
 K//vEwH0RgA3v2X53QUoHg94gTYhSZLKX0zdAFshbphE0HCZ6AO3UO+/ZJ3cui6J
 PYvKOnhby2ErNotZqrs3DQIM8rGgl0ZVgoFQrAWEvwiHRHI/r2ArK/oR4PiBjxJT
 StYuJoTkZIlJyHXza6tvHDcWi+Jc8t8r0HC4Vs32BlaVBQM0SH3CMxHfhJw/Q9xP
 WHFif1sH0N+p7WDyHH71C1T8POOgXY73BsD2AC0se3lRYZ9SVkOVy9ECGUucx8F6
 LDAuFelwRvW2Dr9kh38+5f8Xp155E+eZ6zRWW9/JlrUKVEtHhOFhtrRfDNKHuDCt
 B9ETrxDiSUFAdQ2weye9BK6aXK0CHF6YP3PCbvK77qFUUsN8csFSKktanKrFAbML
 CdjkVkEoeLHw+aXzyDg0pSBRZMQ24dTQDh7YqOFZGuEjCLPXOEQ8nitf0IzBB0KI
 4QetT/QK3bKkgtVKTwPP+s9f4g+fA/oiwJ21ZTV9hi/9upywTa/umCUvH9Fmb8Fp
 VZeTzugSht0+ioXpaF/6/KO0Ccp/t5uAHYeuBBMqiHX7ks8DdnfPCwbWNRKkg25O
 Qy7Y8VnnOtesRCAqBq5y/hHlLUluMkjYpEblYFiD6HBWcjUh6xE6LlIO4mwnjqWI
 zjB3w7+0GOrvS7dBSx0N
 =f4c3
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "These are fixes for things we merged for v4.3 (VPD, MSI, and bridge
  window management), and a new Renesas R8A7794 SoC device ID.

  Details:

  Resource management:
   - Revert pci_read_bridge_bases() unification (Bjorn Helgaas)
   - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn
     Helgaas)

  MSI:
   - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson)

  Renesas R-Car host bridge driver:
   - Add R8A7794 support (Sergei Shtylyov)

  Miscellaneous:
   - Fix devfn for VPD access through function 0 (Alex Williamson)
   - Use function 0 VPD only for identical functions (Alex Williamson)"

* tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: rcar: Add R8A7794 support
  PCI: Use function 0 VPD for identical functions, regular VPD for others
  PCI: Fix devfn for VPD access through function 0
  PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses
  PCI: Clear IORESOURCE_UNSET when clipping a bridge window
  PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
2015-09-25 11:16:53 -07:00
Linus Torvalds
b6d980f493 AMD fixes for bugs introduced in the 4.2 merge window,
and a few PPC bug fixes too.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWBSn7AAoJEL/70l94x66Dxd4H/RT6kWWj9x4grEYUkcJUDyK2
 AXm7XcKQm04auwAic8Otr+ts/Qix/50kWmBe/TU0QLgqb8rj5Dj3yGFK6Z1y6mAz
 KvaxqMJd4tZGTqN0DDvC2ItEdzjfAdeJZo/FHXqPHVspG0G14T7STLna02LTBBEJ
 tNzY9qor8nFhg2fT2szqKaudUNgTqkCTpo57o2BrHE96SHG+m0WdpQCV1F5hPVpg
 Te0Pb7qX9xng5n3sQ7IV/t3QYbrza1ACwNQS9XJa0Yu6iEz7JdmVmzHQASK9ynn6
 hUHhsNYGx4IsPjPtfJk2GroNaRDZL+VMzw07tfcOvPx8xkS9hS63pwzmSBqfLrM=
 =Ywqn
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC
  bug fixes too"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: disable halt_poll_ns as default for s390x
  KVM: x86: fix off-by-one in reserved bits check
  KVM: x86: use correct page table format to check nested page table reserved bits
  KVM: svm: do not call kvm_set_cr0 from init_vmcb
  KVM: x86: trap AMD MSRs for the TSeg base and mask
  KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store()
  KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit
  KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs
  kvm: svm: reset mmu on VCPU reset
2015-09-25 10:51:40 -07:00
Linus Torvalds
57cb635c5c powerpc fixes for 4.3 #2
- Wire up sys_membarrier()
  - cxl: Fix lockdep warning while creating afu_err_buff from Vaibhav
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBK25AAoJEFHr6jzI4aWAjpgP/2ZObQW0NToo88d3H2ERiKP9
 S6c8jAkq+NKEKiNyjQ2pf+rwruVjH/DYIMl6HE9wqyEfVtwPqIOjfaVZiXQG6fno
 +BPomCA1negqC3AvVuY9QELYD8rBcZ9nbUK1FeGwlZZySNteaSnz5xi/BvwTo1rm
 puJOSW4KCk/DpGnN6nuBVOJc7NKoGBgbTc3vXqyQVOF+Lu2BQlkfscbHgDnqVrT9
 R5tE6U6pQJVZFD15pxmp6dmib8ujoX0eVKFz89rzEsKTcDIBvnPTyjMYr3Y5Z5hS
 AhOKtunfZg6LOmeg+zd7u1FNwY3PL9ir59fWu5WUXIvqao67k04Li6eggzyZNQNV
 FT8gnj4pFzpNfv1Czm93Ki4dXd3uTYg02OuB3iPq3R3qKuL6cDwS2NGIl0y5iTqI
 kVVQp7u5UVdRdjCNgKmAe48kzVDzAR7B9OFdQBu0JE4NF107ubZKhpokEoagRyru
 CCe+WK1zXvj9S6UaX8f2Mg//oxTqz3jykxr5R2pHb9eFppuKFbj3hmF081OIrcBQ
 rQTR+3MMfEzqw19LaqLWRo70FSMSz8E79+vlHAjdSr5L7FY5YCmioWN231zq5W1R
 69ENkRaCLHdBLysitdrUXnf2uTcrwIRQuqkohG7JWys2+gz1/mqJUsa2rmXKAvuN
 JP3skFQMPlduIzSEy5rI
 =RQ5A
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 - Wire up sys_membarrier()
 - cxl: Fix lockdep warning while creating afu_err_buff from Vaibhav

* tag 'powerpc-4.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  cxl: Fix lockdep warning while creating afu_err_buff attribute
  powerpc: Wire up sys_membarrier()
2015-09-25 10:11:26 -07:00
Loc Ho
043cba9691 arm64, EDAC: Add L3/SoC DT subnodes to the APM X-Gene SoC EDAC node
Add L3/SoC DT subnodes to the APM X-Gene SoC EDAC node.

Signed-off-by: Loc Ho <lho@apm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: devicetree@vger.kernel.org
Cc: Duc Dang <dhdang@apm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Iyappan Subramanian <isubramanian@apm.com>
Cc: jcm@redhat.com
Cc: Keyur Chudgar <kchudgar@apm.com>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: mchehab@osg.samsung.com
Cc: patches@apm.com
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Rameshwar Prasad Sahu <rsahu@apm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Tanmay Inamdar <tinamdar@apm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Y Vo <yvo@apm.com>
Link: http://lkml.kernel.org/r/1443055261-8613-5-git-send-email-lho@apm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 15:43:53 +02:00
David Hildenbrand
920552b213 KVM: disable halt_poll_ns as default for s390x
We observed some performance degradation on s390x with dynamic
halt polling. Until we can provide a proper fix, let's enable
halt_poll_ns as default only for supported architectures.

Architectures are now free to set their own halt_poll_ns
default value.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:30 +02:00
Paolo Bonzini
58c95070da KVM: x86: fix off-by-one in reserved bits check
29ecd66019 ("KVM: x86: avoid uninitialized variable warning",
2015-09-06) introduced a not-so-subtle problem, which probably
escaped review because it was not part of the patch context.

Before the patch, leaf was always equal to iterator.level.  After,
it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set,
and when is_shadow_zero_bits_set does another "-1" the check on
reserved bits becomes incorrect.  Using "iterator.level" in the call
fixes this call trace:

WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]()
Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[...]
Call Trace:
  dump_stack+0x4e/0x84
  warn_slowpath_common+0x95/0xe0
  warn_slowpath_null+0x1a/0x20
  handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]
  tdp_page_fault+0x231/0x290 [kvm]
  ? emulator_pio_in_out+0x6e/0xf0 [kvm]
  kvm_mmu_page_fault+0x36/0x240 [kvm]
  ? svm_set_cr0+0x95/0xc0 [kvm_amd]
  pf_interception+0xde/0x1d0 [kvm_amd]
  handle_exit+0x181/0xa70 [kvm_amd]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  ? preempt_count_sub+0x9b/0xf0
  ? mutex_lock_killable_nested+0x26f/0x490
  ? preempt_count_sub+0x9b/0xf0
  kvm_vcpu_ioctl+0x358/0x710 [kvm]
  ? __fget+0x5/0x210
  ? __fget+0x101/0x210
  do_vfs_ioctl+0x2f4/0x560
  ? __fget_light+0x29/0x90
  SyS_ioctl+0x4c/0x90
  entry_SYSCALL_64_fastpath+0x16/0x73
---[ end trace 37901c8686d84de6 ]---

Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:29 +02:00
Paolo Bonzini
6fec21449a KVM: x86: use correct page table format to check nested page table reserved bits
Intel CPUID on AMD host or vice versa is a weird case, but it can
happen.  Handle it by checking the host CPU vendor instead of the
guest's in reset_tdp_shadow_zero_bits_mask.  For speed, the
check uses the fact that Intel EPT has an X (executable) bit while
AMD NPT has NX.

Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:28 +02:00
Paolo Bonzini
79a8059d24 KVM: svm: do not call kvm_set_cr0 from init_vmcb
kvm_set_cr0 may want to call kvm_zap_gfn_range and thus access the
memslots array (SRCU protected).  Using a mini SRCU critical section
is ugly, and adding it to kvm_arch_vcpu_create doesn't work because
the VMX vcpu_create callback calls synchronize_srcu.

Fixes this lockdep splat:

===============================
[ INFO: suspicious RCU usage. ]
4.3.0-rc1+ #1 Not tainted
-------------------------------
include/linux/kvm_host.h:488 suspicious rcu_dereference_check() usage!

other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-i38/17000:
 #0:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: kvm_zap_gfn_range+0x24/0x1a0 [kvm]

[...]
Call Trace:
 dump_stack+0x4e/0x84
 lockdep_rcu_suspicious+0xfd/0x130
 kvm_zap_gfn_range+0x188/0x1a0 [kvm]
 kvm_set_cr0+0xde/0x1e0 [kvm]
 init_vmcb+0x760/0xad0 [kvm_amd]
 svm_create_vcpu+0x197/0x250 [kvm_amd]
 kvm_arch_vcpu_create+0x47/0x70 [kvm]
 kvm_vm_ioctl+0x302/0x7e0 [kvm]
 ? __lock_is_held+0x51/0x70
 ? __fget+0x101/0x210
 do_vfs_ioctl+0x2f4/0x560
 ? __fget_light+0x29/0x90
 SyS_ioctl+0x4c/0x90
 entry_SYSCALL_64_fastpath+0x16/0x73

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 10:31:22 +02:00
Denys Vlasenko
0b101e62af x86/asm: Force inlining of cpu_relax()
On x86, cpu_relax() simply calls rep_nop(), which generates one
instruction, PAUSE (aka REP NOP).

With this config:

  http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os

gcc-4.7.2 does not always inline rep_nop(): it generates several
copies of this:

  <rep_nop> (16 copies, 194 calls):
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       f3 90                   pause
       5d                      pop    %rbp
       c3                      retq

See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

This patch fixes this via s/inline/__always_inline/
on rep_nop() and cpu_relax().

( Forcing inlining only on rep_nop() causes GCC to
  deinline cpu_relax(), with almost no change in generated code).

      text     data      bss       dec     hex filename
  88118971 19905208 36421632 144445811 89c1173 vmlinux.before
  88118139 19905208 36421632 144444979 89c0e33 vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443096149-27291-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-25 09:44:34 +02:00
Geliang Tang
7e5560a564 perf/x86: Change test_aperfmperf() and test_intel() to static
Fixes the following sparse warnings:

 arch/x86/kernel/cpu/perf_event_msr.c:13:6: warning: symbol
 'test_aperfmperf' was not declared. Should it be static?

 arch/x86/kernel/cpu/perf_event_msr.c:18:6: warning: symbol
 'test_intel' was not declared. Should it be static?

Signed-off-by: Geliang Tang <geliangtang@163.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4588e8ab09638458f2451af572827108be3b4a36.1443123796.git.geliangtang@163.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-25 09:42:40 +02:00
Andy Lutomirski
3f2c5085ed x86/sched/64: Don't save flags on context switch (reinstated)
This reinstates the following commit:

  2c7577a758 ("sched/x86_64: Don't save flags on context switch")

which was reverted in:

  512255a2ad ("Revert 'sched/x86_64: Don't save flags on context switch'")

Historically, Linux has always saved and restored EFLAGS across
context switches.  As far as I know, the only reason to do this
is because of the NT flag.  In particular, if something calls
switch_to() with the NT flag set, then we don't want to leak the
NT flag into a different task that might try to IRET and fail
because NT is set.

Before this commit:

  8c7aa698ba ("x86_64, entry: Filter RFLAGS.NT on entry from userspace")

we could run system call bodies with NT set.  This would be a DoS or possibly
privilege escalation hole if scheduling in such a system call would leak
NT into a different task.

Importantly, we don't need to worry about NT being set while
preemptible or across page faults.  The only way we can schedule
due to preemption or a page fault is in an interrupt entry that
nests inside the SYSENTER prologue.  The CPU will clear NT when
entering through an interrupt gate, so we won't schedule with NT
set.

The only other interesting flags are IOPL and AC.  Allowing
switch_to() to change IOPL has no effect, as the value loaded
during kernel execution doesn't matter at all except between a
SYSENTER entry and the subsequent PUSHF, and anythign that
interrupts in that window will restore IOPL on return.

If we call __switch_to() with AC set, we have bigger problems.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/d4440fdc2a89247bffb7c003d2a9a2952bd46827.1441146105.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-25 09:29:17 +02:00
Ingo Molnar
9c9ab385bc Merge branch 'linus' into x86/asm, to refresh the tree before applying new changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-25 09:28:58 +02:00
Linus Torvalds
4401555a98 DeviceTree fixes for 4.3:
- Silence bogus warning for of_irq_parse_pci
 - Fix typo in ARM idle-states binding doc and dts files
 - Various minor binding documentation updates
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWBI2/AAoJEMhvYp4jgsXi08cH/11S/92w1PLNmWCv4y7TXHQw
 XTJi7/Hk+q6otV/FigXukdYClR17h/3mFyCKKXHrkAgy/AoSrvN/ABe0bLLoT2AQ
 xh0Rx8F6vZwa6ro1MZcrn3ZxQkoJlNUhoIXtY84oSPWd1ernLOar6HonFiynCQQc
 bDZo5zoLj6DBbSO+UpVvEQN57ogwPgFZ1hGDjJeyyH8c1755z2OVA+k8O0dwjmqW
 Xav/7TO8bFEIbUnfWeKnVyK45qmJxHcTbn3nxUgYFQj3DMI2Hn86WEY4cg8QvTo+
 SpdO1Aio6b9NSTNnpiSvPnc2MFNPFaWjiqm1w86w+PTm8oUT+p1V0OG/EE27fcY=
 =VUfk
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-fixes-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:
 - Silence bogus warning for of_irq_parse_pci
 - Fix typo in ARM idle-states binding doc and dts files
 - Various minor binding documentation updates

* tag 'devicetree-fixes-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  Documentation: arm: Fix typo in the idle-states bindings examples
  gpio: mention in DT binding doc that <name>-gpio is deprecated
  of_pci_irq: Silence bogus "of_irq_parse_pci() failed ..." messages.
  devicetree: bindings: Extend the bma180 bindings with bma250 info
  of: thermal: Mark cooling-*-level properties optional
  of: thermal: Fix inconsitency between cooling-*-state and cooling-*-level
  Docs: dt: add #msi-cells to GICv3 ITS binding
  of: add vendor prefix for Socionext Inc.
2015-09-24 17:46:38 -07:00
Olof Johansson
fe5b2756c1 Add the ddc-i2c-bus reference to the veyron hdmi nodes,
so that they can read the edid of connected displays.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJWAdpbAAoJEPOmecmc0R2B9DgH/1Il++FcqJRQfk1HoSEXeD3s
 A2PX1eDlE984LcA2G/JQZ10iZxcLx07jg4pxRUcEDJ7iMdFrBFg4B7uHL5iM6kp5
 iAhSkRthsuwTpY9ySLe03izjF0a/9+0FvGdG939x7Hlfk4qhtb+ynzvMJcZ9Bs3n
 oxvvXqBBkEbK1vnySJkDnDCnK3fRBUse7QRcxKY2YdSYyxdw9Ga51hReF71pyH4u
 t7+8+4Pniz68i22YYW/zbGsO79f9tpmT7pisIfNnUOu11gDkL9oX3WkzeJWNh5BC
 9guT9CF1c+DLshSWYVDHZ+TCjwVFpgpz3osAKd10FUzWQctU1H/4FGaYviwNILY=
 =c2bs
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rockchip32-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes

Add the ddc-i2c-bus reference to the veyron hdmi nodes,
so that they can read the edid of connected displays.

* tag 'v4.3-rockchip32-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  ARM: dts: Add ddc i2c reference to veyron

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-09-24 16:51:43 -07:00
Benjamin Gaignard
79a313f5a5 ARM: sti: dt: adapt DT to fix probe/bind issues in DRM driver
STI drm drivers probe and bind using component framework was incorrect.
In addition to drivers fix DT update is needed to make all sub-components
become childs of sti-display-subsystem.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
2015-09-24 16:50:21 -07:00
Kishon Vijay Abraham I
9a5e3f27d1 ARM: dts: fix omap2+ address translation for pbias
"ARM: dts: <omap2/omap4/omap5/dra7>: add minimal l4 bus
layout with control module support" moved pbias_regulator dt node
from being a child node of ocp to be the child node of
'syscon'. Since 'syscon' doesn't have the 'ranges' property,
address translation fails while trying to convert the address
to resource. Fix it here by populating 'ranges' property in
syscon dt node.

Fixes: 72b10ac00e ("ARM: dts: omap24xx: add minimal l4 bus
layout with control module support")

Fixes: 7415b0b4c6 ("ARM: dts: omap4: add minimal l4 bus layout
with control module support")

Fixes: ed8509eddd ("ARM: dts: omap5: add minimal l4 bus
layout with control module support")

Fixes: d919501fef ("ARM: dts: dra7: add minimal l4 bus
layout with control module support")

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
[tony@atomide.com: fixed omap3 pbias to work]
Signed-off-by: Tony Lindgren <tony@atomide.com>
2015-09-24 16:28:32 -07:00
Lorenzo Pieralisi
a13f18f59d Documentation: arm: Fix typo in the idle-states bindings examples
The idle-states bindings mandate that the entry-method string
in the idle-states node must be "psci" for ARM v8 64-bit systems,
but the examples in the bindings report a wrong entry-method string.
Owing to this typo, some dts in the kernel wrongly defined the
entry-method property, since they likely cut and pasted the example
definition without paying attention to the bindings definitions.

This patch fixes the typo in the DT idle states bindings examples and
respective dts in the kernel so that the bindings and related dts
files are made compliant.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Howard Chen <howard.chen@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Rob Herring <robh@kernel.org>
2015-09-24 17:55:32 -05:00
Felipe F. Tonello
0af8221108 ARM: dts: fix usb pin control for imx-rex dts
This fixes a duplicated pin control causing this error:

imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_GPIO_1 already
requested by regulators:regulator@2; cannot claim for 2184000.usb
imx6q-pinctrl 20e0000.iomuxc: pin-137 (2184000.usb) status -22
imx6q-pinctrl 20e0000.iomuxc: could not request pin 137
(MX6Q_PAD_GPIO_1) from group usbotggrp  on device 20e0000.iomuxc
imx_usb 2184000.usb: Error applying setting, reverse things
back
imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_EIM_D31 already
requested by regulators:regulator@1; cannot claim for 2184200.usb
imx6q-pinctrl 20e0000.iomuxc: pin-52 (2184200.usb) status -22
imx6q-pinctrl 20e0000.iomuxc: could not request pin 52 (MX6Q_PAD_EIM_D31)
from group usbh1grp  on device 20e0000.iomuxc
imx_usb 2184200.usb: Error applying setting, reverse things
back

Signed-off-by: Felipe F. Tonello <eu@felipetonello.com>
Fixes: e2047e33f2 ("ARM: dts: add initial Rex Pro board support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2015-09-24 05:13:40 -07:00
Russell King
274e91b81e ARM: alignment: fix alignment handling for uaccess changes
Jonathan Liu reports that the recent addition of CPU_SW_DOMAIN_PAN
causes wpa_supplicant to die due to the following kernel oops:

Unhandled fault: page domain fault (0x81b) at 0x001017a2
pgd = ee1b8000
[001017a2] *pgd=6ebee831, *pte=6c35475f, *ppte=6c354c7f
Internal error: : 81b [#1] SMP ARM
Modules linked in: rt2800usb rt2x00usb rt2800librt2x00lib crc_ccitt mac80211
CPU: 1 PID: 202 Comm: wpa_supplicant Not tainted 4.3.0-rc2 #1
Hardware name: Allwinner sun7i (A20) Family
task: ec872f80 ti: ee364000 task.ti: ee364000
PC is at do_alignment_ldmstm+0x1d4/0x238
LR is at 0x0
pc : [<c001d1d8>]    lr : [<00000000>]    psr: 600c0113
sp : ee365e18  ip : 00000000  fp : 00000002
r10: 001017a2  r9 : 00000002  r8 : 001017aa
r7 : ee365fb0  r6 : e8820018  r5 : 001017a2  r4 : 00000003
r3 : d49e30e0  r2 : 00000000  r1 : ee365fbc  r0 : 00000000
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none[   34.393106] Control: 10c5387d  Table: 6e1b806a  DAC: 00000051
Process wpa_supplicant (pid: 202, stack limit = 0xee364210)
Stack: (0xee365e18 to 0xee366000)
...
[<c001d1d8>] (do_alignment_ldmstm) from [<c001d510>] (do_alignment+0x1f0/0x904)
[<c001d510>] (do_alignment) from [<c00092a0>] (do_DataAbort+0x38/0xb4)
[<c00092a0>] (do_DataAbort) from [<c0013d7c>] (__dabt_usr+0x3c/0x40)
Exception stack(0xee365fb0 to 0xee365ff8)
5fa0:                                     00000000 56c728c0 001017a2 d49e30e0
5fc0: 775448d2 597d4e74 00200800 7a9e1625 00802001 00000021 b6deec84 00000100
5fe0: 08020200 be9f4f20 0c0b0d0a b6d9b3e0 600c0010 ffffffff
Code: e1a0a005 e1a0000c 1affffe8 e5913000 (e4ea3001)
---[ end trace 0acd3882fcfdf9dd ]---

This is caused by the alignment handler not being fixed up for the
uaccess changes, and userspace issuing an unaligned LDM instruction.
So, fix the problem by adding the necessary fixups.

Reported-by: Jonathan Liu <net147@gmail.com>
Tested-by: Jonathan Liu <net147@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-24 11:07:00 +01:00
Borislav Petkov
158ecc3918 x86/fpu: Fixup uninitialized feature_name warning
Hand in &feature_name to cpu_has_xfeatures() as it is supposed
to. Fixes an uninitialized warning.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: brgerst@gmail.com
Cc: dvlasenk@redhat.com
Cc: fenghua.yu@intel.com
Cc: luto@amacapital.net
Cc: tim.c.chen@linux.intel.com
Fixes: d91cab7813 ("x86/fpu: Rename XSAVE macros")
Link: http://lkml.kernel.org/r/20150923104901.GA3538@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-24 09:21:20 +02:00
Kristen Carlson Accardi
a7adb91b13 x86/cpufeatures: Correct spelling of the HWP_NOTIFY flag
Because noitification just isn't right.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/1442944296-11737-1-git-send-email-kristen@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-23 09:57:24 +02:00
Peter Zijlstra
62e8a3258b atomic, arch: Audit atomic_{read,set}()
This patch makes sure that atomic_{read,set}() are at least
{READ,WRITE}_ONCE().

We already had the 'requirement' that atomic_read() should use
ACCESS_ONCE(), and most archs had this, but a few were lacking.
All are now converted to use READ_ONCE().

And, by a symmetry and general paranoia argument, upgrade atomic_set()
to use WRITE_ONCE().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: james.hogan@imgtec.com
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-23 09:54:28 +02:00
Martin Schwidefsky
22be9cd9f2 s390/numa: use correct type for node_to_cpumask_map
With CONFIG_CPUMASK_OFFSTACK=y cpumask_var_t is a pointer to a CPU mask.
Replace the incorrect type for node_to_cpumask_map with cpumask_t.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-09-23 09:18:56 +02:00
Andrey Ryabinin
769a8089c1 x86, efi, kasan: #undef memset/memcpy/memmove per arch
In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove
with not-instrumented analogues __memset/__memcpy/__memove.

However, on x86 the EFI stub is not linked with the kernel.  It uses
not-instrumented mem*() functions from arch/x86/boot/compressed/string.c

So we don't replace them with __mem*() variants in EFI stub.

On ARM64 the EFI stub is linked with the kernel, so we should replace
mem*() functions with __mem*(), because the EFI stub runs before KASAN
sets up early shadow.

So let's move these #undef mem* into arch's asm/efi.h which is also
included by the EFI stub.

Also, this will fix the warning in 32-bit build reported by kbuild test
robot:

	efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy'

[akpm@linux-foundation.org: use 80 cols in comment]
Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Reported-by: Fengguang Wu <fengguang.wu@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-22 15:09:53 -07:00
Andy Lutomirski
83c133cf11 x86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI code
The NMI entry code that switches to the normal kernel stack needs to
be very careful not to clobber any extra stack slots on the NMI
stack.  The code is fine under the assumption that SWAPGS is just a
normal instruction, but that assumption isn't really true.  Use
SWAPGS_UNSAFE_STACK instead.

This is part of a fix for some random crashes that Sasha saw.

Fixes: 9b6e6a8334 ("x86/nmi/64: Switch stacks on userspace NMI entry")
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/974bc40edffdb5c2950a5c4977f821a446b76178.1442791737.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 22:40:36 +02:00
Andy Lutomirski
fc57a7c680 x86/paravirt: Replace the paravirt nop with a bona fide empty function
PARAVIRT_ADJUST_EXCEPTION_FRAME generates this code (using nmi as an
example, trimmed for readability):

    ff 15 00 00 00 00       callq  *0x0(%rip)        # 2796 <nmi+0x6>
              2792: R_X86_64_PC32     pv_irq_ops+0x2c

That's a call through a function pointer to regular C function that
does nothing on native boots, but that function isn't protected
against kprobes, isn't marked notrace, and is certainly not
guaranteed to preserve any registers if the compiler is feeling
perverse.  This is bad news for a CLBR_NONE operation.

Of course, if everything works correctly, once paravirt ops are
patched, it gets nopped out, but what if we hit this code before
paravirt ops are patched in?  This can potentially cause breakage
that is very difficult to debug.

A more subtle failure is possible here, too: if _paravirt_nop uses
the stack at all (even just to push RBP), it will overwrite the "NMI
executing" variable if it's called in the NMI prologue.

The Xen case, perhaps surprisingly, is fine, because it's already
written in asm.

Fix all of the cases that default to paravirt_nop (including
adjust_exception_frame) with a big hammer: replace paravirt_nop with
an asm function that is just a ret instruction.

The Xen case may have other problems, so document them.

This is part of a fix for some random crashes that Sasha saw.

Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/8f5d2ba295f9d73751c33d97fda03e0495d9ade0.1442791737.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 22:40:28 +02:00
Daniel J Blueman
ce2e572cfe x86/numachip: Introduce Numachip2 timer mechanisms
Add 1GHz 64-bit Numachip2 clocksource timer support for accurate
system-wide timekeeping, as core TSCs are unsynchronised.

Additionally, add a per-core clockevent mechanism that interrupts via the
platform IPI vector after a programmed period.

[ tglx: Taking it through x86 due to dependencies ]

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1442829745-29311-1-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 22:25:33 +02:00
Daniel J Blueman
ad03a9c25d x86/numachip: Add Numachip IPI optimisations
When sending IPIs, first check if the non-local part of the source and
destination APIC IDs match; if so, send via the local APIC for efficiency.

Secondly, since the AMD BIOS-kernel developer guide states IPI delivery
will occur invarient of prior deliver status, avoid polling the delivery
status bit for efficiency.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1442768522-19217-3-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 22:25:33 +02:00
Daniel J Blueman
d9d4dee6ce x86/numachip: Add Numachip2 APIC support
Introduce support for Numachip2 remote interrupts via detecting the right
ACPI SRAT signature.

Access is performed via a fixed mapping in the x86 physical address space.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1442768522-19217-2-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 22:25:33 +02:00
Daniel J Blueman
db1003a719 x86/numachip: Cleanup Numachip support
Drop unused code and includes in Numachip header files and APIC driver.

Additionally, use the 'numachip1' prefix on Numachip1-specific functions;
this prepares for adding Numachip2 support in later patches.

Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: http://lkml.kernel.org/r/1442768522-19217-1-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 22:25:32 +02:00
Paolo Bonzini
5b6a7175bf Merge branch 'kvm-ppc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master 2015-09-22 22:01:46 +02:00
Toshi Kani
55696b1f66 x86/mm: Fix no-change case in try_preserve_large_page()
try_preserve_large_page() checks if new_prot is the same as
old_prot.  If so, it simply sets do_split to 0, and returns
with no-operation.  However, old_prot is set as a 4KB pgprot
value while new_prot is a large page pgprot value.

Now that old_prot is initially set from p?d_pgprot() as a
large page pgprot value, fix it by not overwriting old_prot
with a 4KB pgprot value.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Robert Elliot <elliott@hpe.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1442514264-12475-12-git-send-email-toshi.kani@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 21:27:33 +02:00
Toshi Kani
d551aaa2f7 x86/mm: Fix __split_large_page() to handle large PAT bit
__split_large_page() is called from __change_page_attr() to change
the mapping attribute by splitting a given large page into smaller
pages.  This function uses pte_pfn() and pte_pgprot() for PUD/PMD,
which do not handle the large PAT bit properly.

Fix __split_large_page() by using the corresponding pud/pmd pfn/
pgprot interfaces.

Also remove '#ifdef CONFIG_X86_64', which is not necessary.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Robert Elliot <elliott@hpe.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1442514264-12475-11-git-send-email-toshi.kani@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 21:27:33 +02:00
Toshi Kani
3a19109efb x86/mm: Fix try_preserve_large_page() to handle large PAT bit
try_preserve_large_page() is called from __change_page_attr() to
change the mapping attribute of a given large page.  This function
uses pte_pfn() and pte_pgprot() for PUD/PMD, which do not handle
the large PAT bit properly.

Fix try_preserve_large_page() by using the corresponding pud/pmd
prot/pfn interfaces.

Also remove '#ifdef CONFIG_X86_64', which is not necessary.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Robert Elliot <elliott@hpe.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1442514264-12475-10-git-send-email-toshi.kani@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-22 21:27:33 +02:00