- Yet another race with VM destruction plugged
- A set of small vgic fixes
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlmDOZ4VHG1hcmMuenlu
Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDnwQP/ji7d7bbWybmeuWKXm8TYt6CNPBr
JC57OQHytU1+ccagjeUPaAgUAqWSQWOwfbqPuS1afshJu1ZZsN851QNmcr5W6bXQ
EwjICGWcAdKSMLnhRDdf+uXQbFSkkcaxsXnJWjmD0EE8ylaXCCkdV278xhTx0hVO
yiov/xWNzLMa3O3W/l4SIKQ4UNyeQ7f+Od1Vkf7iUpaaFcz6s3RNsPAOwc7Thq7L
eLk4SgXMLDNHz5HwdLoOp3RwQsNe9A7uh05z9VOav0YAyLG5vf29JhMroCO/4P/x
1HgWvpGwzxAUqUcHbW4FSOsbydEltlWMdfSoAF7BtPCLudmmsxfMJJlK3FKLvV+P
MsO2FdXiz6/ZLI9/ds7YJDOfc1cGSVK07Efx+SLXD5FAnkFYq4SElmI4sK8BWc5U
Dugw3j8u9M6jTiVKBioFo7yRT5iHG9O0A+6DtsMQ0iREfLolC7EEzpfGB/vWpKk5
BbdwDUiu6JxXEBJJqyP948iGfuIrikAx2mcf3dmEHVKhEnKi+MN/H5c9BKGAJ3sf
Fm1xF99IG+m3L3zXiukQ11OqsCx32kqLiXaejVttscFg/C33OYmGOhsk066hHCzx
LtoxrHUl7rwF+ssnKHq4Az2mYzZtquEq13O5tgleQcVwH/2+6EgWDf3mDSy1em0h
5IXgqk+PsaQ5n0jL
=TMR9
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-v4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/ARM Fixes for v4.13-rc4
- Yet another race with VM destruction plugged
- A set of small vgic fixes
------------[ cut here ]------------
WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
Call Trace:
vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
? kvm_arch_vcpu_load+0x62/0x230 [kvm]
kvm_vcpu_ioctl+0x340/0x700 [kvm]
? kvm_vcpu_ioctl+0x340/0x700 [kvm]
? __fget+0xfc/0x210
do_vfs_ioctl+0xa4/0x6a0
? __fget+0x11d/0x210
SyS_ioctl+0x79/0x90
do_syscall_64+0x8f/0x750
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL64_slow_path+0x25/0x25
This can be reproduced by booting L1 guest w/ 'noapic' grub parameter, which
means that tells the kernel to not make use of any IOAPICs that may be present
in the system.
Actually external_intr variable in nested_vmx_vmexit() is the req_int_win
variable passed from vcpu_enter_guest() which means that the L0's userspace
requests an irq window. I observed the scenario (!kvm_cpu_has_interrupt(vcpu) &&
L0's userspace reqeusts an irq window) is true, so there is no interrupt which
L1 requires to inject to L2, we should not attempt to emualte "Acknowledge
interrupt on exit" for the irq window requirement in this scenario.
This patch fixes it by not attempt to emulate "Acknowledge interrupt on exit"
if there is no L1 requirement to inject an interrupt to L2.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Added code comment to make it obvious that the behavior is not correct.
We should do a userspace exit with open interrupt window instead of the
nested VM exit. This patch still improves the behavior, so it was
accepted as a (temporary) workaround.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The host physical addresses of L1's Virtual APIC Page and Posted
Interrupt descriptor are loaded into the VMCS02. The CPU may write
to these pages via their host physical address while L2 is running,
bypassing address-translation-based dirty tracking (e.g. EPT write
protection). Mark them dirty on every exit from L2 to prevent them
from getting out of sync with dirty tracking.
Also mark the virtual APIC page and the posted interrupt descriptor
dirty when KVM is virtualizing posted interrupt processing.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
According to the Intel SDM, software cannot rely on the current VMCS to be
coherent after a VMXOFF or shutdown. So this is a valid way to handle VMCS12
flushes.
24.11.1 Software Use of Virtual-Machine Control Structures
...
If a logical processor leaves VMX operation, any VMCSs active on
that logical processor may be corrupted (see below). To prevent
such corruption of a VMCS that may be used either after a return
to VMX operation or on another logical processor, software should
execute VMCLEAR for that VMCS before executing the VMXOFF instruction
or removing power from the processor (e.g., as part of a transition
to the S3 and S4 power states).
...
This fixes a "suspicious rcu_dereference_check() usage!" warning during
kvm_vm_release() because nested_release_vmcs12() calls
kvm_vcpu_write_guest_page() without holding kvm->srcu.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Since the current implementation of VMCS12 does a memcpy in and out
of guest memory, we do not need current_vmcs12 and current_vmcs12_page
anymore. current_vmptr is enough to read and write the VMCS12.
And David Matlack noted:
This patch also fixes dirty tracking (memslot->dirty_bitmap) of the
VMCS12 page by using kvm_write_guest. nested_release_page() only marks
the struct page dirty.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[Added David Matlack's note and nested_release_page_clean() fix.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
'lapic_irq' is a local variable and its 'level' field isn't
initialized, so 'level' is random, it doesn't matter but
makes UBSAN unhappy:
UBSAN: Undefined behaviour in .../lapic.c:...
load of value 10 is not a valid value for type '_Bool'
...
Call Trace:
[<ffffffff81f030b6>] dump_stack+0x1e/0x20
[<ffffffff81f03173>] ubsan_epilogue+0x12/0x55
[<ffffffff81f03b96>] __ubsan_handle_load_invalid_value+0x118/0x162
[<ffffffffa1575173>] kvm_apic_set_irq+0xc3/0xf0 [kvm]
[<ffffffffa1575b20>] kvm_irq_delivery_to_apic_fast+0x450/0x910 [kvm]
[<ffffffffa15858ea>] kvm_irq_delivery_to_apic+0xfa/0x7a0 [kvm]
[<ffffffffa1517f4e>] kvm_emulate_hypercall+0x62e/0x760 [kvm]
[<ffffffffa113141a>] handle_vmcall+0x1a/0x30 [kvm_intel]
[<ffffffffa114e592>] vmx_handle_exit+0x7a2/0x1fa0 [kvm_intel]
...
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
When SMP VM start, AP may lost INIT because of receiving INIT between
kvm_vcpu_ioctl_x86_get/set_vcpu_events.
vcpu 0 vcpu 1
kvm_vcpu_ioctl_x86_get_vcpu_events
events->smi.latched_init = 0
send INIT to vcpu1
set vcpu1's pending_events
kvm_vcpu_ioctl_x86_set_vcpu_events
if (events->smi.latched_init == 0)
clear INIT in pending_events
This patch fixes it by just update SMM related flags if we are in SMM.
Thanks Peng Hao for the report and original commit message.
Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The number of pins in South Bridge is 30 and not 29. There is a fix for
the driver for the pinctrl, but a fix is also need at device tree level
for the GPIO.
Fixes: afda007fed ("ARM64: dts: marvell: Add pinctrl nodes for Armada
3700")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
of_irq_to_resource() has recently been fixed to return negative error #'s
along with 0 in case of failure, however the Freescale MPC832x RDB board
code still only regards 0 as a failure indication -- fix it up.
Fixes: 7a4228bbff ("of: irq: use of_irq_get() in of_irq_to_resource()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
WARNING: CPU: 5 PID: 1242 at kernel/rcu/tree_plugin.h:323 rcu_note_context_switch+0x207/0x6b0
CPU: 5 PID: 1242 Comm: unity-settings- Not tainted 4.13.0-rc2+ #1
RIP: 0010:rcu_note_context_switch+0x207/0x6b0
Call Trace:
__schedule+0xda/0xba0
? kvm_async_pf_task_wait+0x1b2/0x270
schedule+0x40/0x90
kvm_async_pf_task_wait+0x1cc/0x270
? prepare_to_swait+0x22/0x70
do_async_page_fault+0x77/0xb0
? do_async_page_fault+0x77/0xb0
async_page_fault+0x28/0x30
RIP: 0010:__d_lookup_rcu+0x90/0x1e0
I encounter this when trying to stress the async page fault in L1 guest w/
L2 guests running.
Commit 9b132fbe54 (Add rcu user eqs exception hooks for async page
fault) adds rcu_irq_enter/exit() to kvm_async_pf_task_wait() to exit cpu
idle eqs when needed, to protect the code that needs use rcu. However,
we need to call the pair even if the function calls schedule(), as seen
from the above backtrace.
This patch fixes it by informing the RCU subsystem exit/enter the irq
towards/away from idle for both n.halted and !n.halted.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
There are three issues in nested_vmx_check_exception:
1) it is not taking PFEC_MATCH/PFEC_MASK into account, as reported
by Wanpeng Li;
2) it should rebuild the interruption info and exit qualification fields
from scratch, as reported by Jim Mattson, because the values from the
L2->L0 vmexit may be invalid (e.g. if an emulated instruction causes
a page fault, the EPT misconfig's exit qualification is incorrect).
3) CR2 and DR6 should not be written for exception intercept vmexits
(CR2 only for AMD).
This patch fixes the first two and adds a comment about the last,
outlining the fix.
Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do this in the caller of nested_vmx_vmexit instead.
nested_vmx_check_exception was doing a vmwrite to the vmcs02's
VM_EXIT_INTR_ERROR_CODE field, so that prepare_vmcs12 would move
the field to vmcs12->vm_exit_intr_error_code. However that isn't
possible on pre-Haswell machines. Moving the vmcs12 write to the
callers fixes it.
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Changed nested_vmx_reflect_vmexit() return type to (int)1 from (bool)1,
thanks to fengguang.wu@intel.com]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Functions clear_user_highpage, copy_user_highpage, flush_dcache_page,
local_flush_cache_range and local_flush_cache_page may be used from
modules. Export them.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
csum_partial and csum_partial_copy_generic are defined unconditionally
and are available even when CONFIG_NET is disabled. They are used not
only by the network drivers, but also by scsi and media.
Don't limit these functions export by CONFIG_NET.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
In an ideal world, CNTFRQ_EL0 always contains the timer frequency
for the kernel to use. Sadly, we get quite a few broken systems
where the firmware authors cannot be bothered to program that
register on all CPUs, and rely on DT to provide that frequency.
So when trapping CNTFRQ_EL0, make sure to return the actual rate
(as known by the kernel), and not CNTFRQ_EL0.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The HPET resume path abuses irq_domain_[de]activate_irq() to restore the
MSI message in the HPET chip for the boot CPU on resume and it relies on an
implementation detail of the interrupt core code, which magically makes the
HPET unmask call invoked via a irq_disable/enable pair. This worked as long
as the irq code did unconditionally invoke the unmask() callback. With the
recent changes which keep track of the masked state to avoid expensive
hardware access, this does not longer work. As a consequence the HPET timer
interrupts are not unmasked which breaks resume as the boot CPU waits
forever that a timer interrupt arrives.
Make the restore of the MSI message explicit and invoke the unmask()
function directly. While at it get rid of the pointless affinity setting as
nothing can change the affinity of the interrupt and the vector across
suspend/resume. The restore of the MSI message reestablishes the previous
affinity setting which is the correct one.
Fixes: bf22ff45be ("genirq: Avoid unnecessary low level irq function calls")
Reported-and-tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Reported-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: jeffy.chen@rock-chips.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707312158590.2287@nanos
While working on enabling queued rwlock on SPARC, found this following
code in include/asm-generic/qrwlock.h which uses CONFIG_CPU_BIG_ENDIAN
to clear a byte.
static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
{
return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
}
Problem is many of the fixed big endian architectures don't define
CPU_BIG_ENDIAN and clears the wrong byte.
Define CPU_BIG_ENDIAN for parisc architecture to fix it.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Helge Deller <deller@gmx.de>
The watchdog soft-NMI exception stack setup loads a stack pointer
twice, which is an obvious error. It ends up using the system reset
interrupt (true-NMI) stack, which is also a bug because the watchdog
could be preempted by a system reset interrupt that overwrites the
NMI stack.
Change the soft-NMI to use the "emergency stack". The current kernel
stack is not used, because of the longer-term goal to prevent
asynchronous stack access using soft-disable.
Fixes: 2104180a53 ("powerpc/64s: implement arch-specific hardlockup watchdog")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZapWhAAoJEHm+PkMAQRiGKb0IAJM6b7SbWaw69Og7+qiFB+zZ
xp29iXqbE9fPISC6a5BRQV1ONjeDM6opGixGHqGC8Hla6k2IYz25VDNoF8wd0MXN
cz/Ih20vd3C5afxXGe5cTT8lsPAlV0mWXxForlu6j8jPeL62FPfq6RhEkw7AcrYL
yfYy3k3qSdOrrvBdII0WAAUi46UfIs+we9BQgbsMbkHOiqV2K0MOrzKE84Xbgepq
RAy2xg6P4b4+hTx8xTrYc1MXwpnqjRc0oJ08gdmiwW3AOOU7LxYFn7zDkLPWi9Rr
g4x6r4YhBTGxT4wNvovLIiqd9QFs//dMCuPWYwEtTICG48umIqqq24beQ0mvCdg=
=08Ic
-----END PGP SIGNATURE-----
Merge tag 'v4.13-rc1' into fixes
The fixes branch is based off a random pre-rc1 commit, because we had
some fixes that needed to go in before rc1 was released.
However we now need to fix some code that went in after that point, but
before rc1, so merge rc1 to get that code into fixes so we can fix it!
Since kernel 4.11 the thread and irq stacks on parisc randomly overflow
the default size of 16k. The reason why stack usage suddenly grew is yet
unknown.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.11+
Signed-off-by: Helge Deller <deller@gmx.de>
In testing James' patch to drivers/parisc/pdc_stable.c, I hit the BUG
statement in flush_cache_range() during a system shutdown:
kernel BUG at arch/parisc/kernel/cache.c:595!
CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
Workqueue: events free_ioctx
IAOQ[0]: flush_cache_range+0x144/0x148
IAOQ[1]: flush_cache_page+0x0/0x1a8
RP(r2): flush_cache_range+0xec/0x148
Backtrace:
[<00000000402910ac>] unmap_page_range+0x84/0x880
[<00000000402918f4>] unmap_single_vma+0x4c/0x60
[<0000000040291a18>] zap_page_range_single+0x110/0x160
[<0000000040291c34>] unmap_mapping_range+0x174/0x1a8
[<000000004026ccd8>] truncate_pagecache+0x50/0xa8
[<000000004026cd84>] truncate_setsize+0x54/0x70
[<000000004033d534>] put_aio_ring_file+0x44/0xb0
[<000000004033d5d8>] aio_free_ring+0x38/0x140
[<000000004033d714>] free_ioctx+0x34/0xa8
[<00000000401b0028>] process_one_work+0x1b8/0x4d0
[<00000000401b04f4>] worker_thread+0x1b4/0x648
[<00000000401b9128>] kthread+0x1b0/0x208
[<0000000040150020>] end_fault_vector+0x20/0x28
[<0000000040639518>] nf_ip_reroute+0x50/0xa8
[<0000000040638ed0>] nf_ip_route+0x10/0x78
[<0000000040638c90>] xfrm4_mode_tunnel_input+0x180/0x1f8
CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
Workqueue: events free_ioctx
Backtrace:
[<0000000040163bf0>] show_stack+0x20/0x38
[<0000000040688480>] dump_stack+0xa8/0x120
[<0000000040163dc4>] die_if_kernel+0x19c/0x2b0
[<0000000040164d0c>] handle_interruption+0xa24/0xa48
This patch modifies flush_cache_range() to handle non current contexts.
In as much as this occurs infrequently, the simplest approach is to
flush the entire cache when this happens.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Pull x86 fixes from Thomas Gleixner:
"A small set of x86 fixes:
- prevent the kernel from using the EFI reboot method when EFI is
disabled.
- two patches addressing clang issues"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Disable the address-of-packed-member compiler warning
x86/efi: Fix reboot_mode when EFI runtime services are disabled
x86/boot: #undef memcpy() et al in string.c
Pull perf fixes from Thomas Gleixner:
"A couple of fixes for performance counters and kprobes:
- a series of small patches which make the uncore performance
counters on Skylake server systems work correctly
- add a missing instruction slot release to the failure path of
kprobes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kprobes/x86: Release insn_slot in failure path
perf/x86/intel/uncore: Fix missing marker for skx_uncore_cha_extra_regs
perf/x86/intel/uncore: Fix SKX CHA event extra regs
perf/x86/intel/uncore: Remove invalid Skylake server CHA filter field
perf/x86/intel/uncore: Fix Skylake server CHA LLC_LOOKUP event umask
perf/x86/intel/uncore: Fix Skylake server PCU PMU event format
perf/x86/intel/uncore: Fix Skylake UPI PMU event masks
After commit f8475cef90 "x86: use common aperfmperf_khz_on_cpu() to
calculate KHz using APERF/MPERF" the scaling_cur_freq policy attribute
in sysfs only behaves as expected on x86 with APERF/MPERF registers
available when it is read from at least twice in a row. The value
returned by the first read may not be meaningful, because the
computations in there use cached values from the previous iteration
of aperfmperf_snapshot_khz() which may be stale.
To prevent that from happening, modify arch_freq_get_on_cpu() to
call aperfmperf_snapshot_khz() twice, with a short delay between
these calls, if the previous invocation of aperfmperf_snapshot_khz()
was too far back in the past (specifically, more that 1s ago).
Also, as pointed out by Doug Smythies, aperf_delta is limited now
and the multiplication of it by cpu_khz won't overflow, so simplify
the s->khz computations too.
Fixes: f8475cef90 "x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF"
Reported-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Simon Horman reported that Koelsch and Lager hang during boot, and
bisected this to commit 1c3c5eab17 ("sched/core: Enable
might_sleep() and smp_processor_id() checks early").
The da9063/da9210 regulator quirk for R-Car Gen2 boards uses a bus
notifier, and unregisters the notifier when it is no longer needed.
However, a notifier must not be unregistered from within the call chain.
This bug went unnoticed, as blocking_notifier_chain_unregister() didn't
take the semaphore during early boot. The aforementioned commit changed
that behavior, leading to a deadlock.
Fix this by removing the call to bus_unregister_notifier(), and keeping
local completion state instead.
Reported-by: Simon Horman <horms+renesas@verge.net.au>
Fixes: 663fbb5215 ("ARM: shmobile: R-Car Gen2: Add da9063/da9210 regulator quirk")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Currently building kernel for xtensa core with aliasing WT cache fails
with the following messages:
mm/memory.c:2152: undefined reference to `flush_dcache_page'
mm/memory.c:2332: undefined reference to `local_flush_cache_page'
mm/memory.c:1919: undefined reference to `local_flush_cache_range'
mm/memory.c:4179: undefined reference to `copy_to_user_page'
mm/memory.c:4183: undefined reference to `copy_from_user_page'
This happens because implementation of these functions is only compiled
when data cache is WB, which looks wrong: even when data cache doesn't
need flushing it still needs invalidation. The functions like
__flush_[invalidate_]dcache_* are correctly defined for both WB and WT
caches (and even if they weren't that'd still be ok, just slower).
Fix this by providing the same implementation of the above functions for
both WB and WT cache.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
PPC: host crash fixes.
x86: bugfixes, including making nested posted interrupts really work.
Generic: tweaks to kvm_stat and to uevents
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJZe2EYAAoJEL/70l94x66D4JYH/AnvioKWTsplhUKt4Y4JlpJX
EXYjQd/CIZ+MHNNUH+U+XEj6tKQymKrz4TeZSs1o0nyxCeyparR3gK27OYVpPspN
GkPSit3hyRgW9r5uXp6pZCJuFCAMpMZ6z4sKbT1FxDhnWnpWayV9w8KA+yQT/UUX
dNQ9JJPUxApcM4NCaj2OCQ8K1koNIDCc52+jATf0iK/Heiaf6UGqCcHXUIy5I5wM
OWk05Qm32VBAYb6P6FfoyGdLMNAAkJtr1fyOJDkxX730CYgwpjIP0zifnJ1bt8V2
YRnjvPO5QciDHbZ8VynwAkKi0ZAd8psjwXh0KbyahPL/2/sA2xCztMH25qweriI=
=fsfr
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"s390:
- SRCU fix
PPC:
- host crash fixes
x86:
- bugfixes, including making nested posted interrupts really work
Generic:
- tweaks to kvm_stat and to uevents"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: LAPIC: Fix reentrancy issues with preempt notifiers
tools/kvm_stat: add '-f help' to get the available event list
tools/kvm_stat: use variables instead of hard paths in help output
KVM: nVMX: Fix loss of L2's NMI blocking state
KVM: nVMX: Fix posted intr delivery when vcpu is in guest mode
x86: irq: Define a global vector for nested posted interrupts
KVM: x86: do mask out upper bits of PAE CR3
KVM: make pid available for uevents without debugfs
KVM: s390: take srcu lock when getting/setting storage keys
KVM: VMX: remove unused field
KVM: PPC: Book3S HV: Fix host crash on changing HPT size
KVM: PPC: Book3S HV: Enable TM before accessing TM registers
- Ensure we have a guard page after the kernel image in vmalloc
- Fix incorrect prefetch stride in copy_page
- Ensure irqs are disabled in die()
- Fix for event group validation in QCOM L2 PMU driver
- Fix requesting of PMU IRQs on AMD Seattle
- Minor cleanups and fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJZey1iAAoJELescNyEwWM0w/0H/1RaHFUSoFUIoL+qFD0eGXcp
hORI0sIHrUlHRONTFYMTyNko7kxELz5aDm6pc87dzBUNoUq3gxhqeEa0zsmwOPsQ
m4iDa7r9xXT+nBITe2auAg6miEMX7Ym448dDrIyKNcRK+2SyZoFqS0vr8UVqs1P/
NwdFGgpKHbV4r1Jeoosom+n7VnuyE0vYBKo8TlRks6NvQJoh2duiPkL+AsBgCfBq
fznck7jIPL4z4kf4Fp/Yz1QsmMhkDSidPmGD/m97Bj4wvEbMwf0u8Dnv1tySK5wx
NwKeN0Dn7JphtL5c5j+OGiri7gTcswjxHJ9f6d0Ez+2TwnjWFM6JNQ+xdVqFcxc=
=EpS9
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"I'd been collecting these whilst we debugged a CPU hotplug failure,
but we ended up diagnosing that one to tglx, who has taken a fix via
the -tip tree separately.
We're seeing some NFS issues that we haven't gotten to the bottom of
yet, and we've uncovered some issues with our backtracing too so there
might be another fixes pull before we're done.
Summary:
- Ensure we have a guard page after the kernel image in vmalloc
- Fix incorrect prefetch stride in copy_page
- Ensure irqs are disabled in die()
- Fix for event group validation in QCOM L2 PMU driver
- Fix requesting of PMU IRQs on AMD Seattle
- Minor cleanups and fixes"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mmu: Place guard page after mapping of kernel image
drivers/perf: arm_pmu: Request PMU SPIs with IRQF_PER_CPU
arm64: sysreg: Fix unprotected macro argmuent in write_sysreg
perf: qcom_l2: fix column exclusion check
arm64/lib: copy_page: use consistent prefetch stride
arm64/numa: Drop duplicate message
perf: Convert to using %pOF instead of full_name
arm64: Convert to using %pOF instead of full_name
arm64: traps: disable irq in die()
arm64: atomics: Remove '&' from '+&' asm constraint in lse atomics
arm64: uaccess: Remove redundant __force from addr cast in __range_ok
The highlight is Ben's patch to work around a host killing bug when running KVM
guests with the Radix MMU on Power9. See the long change log of that commit for
more detail.
And then three fairly minor fixes:
- Fix of_node_put() underflow during reconfig remove, using old DLPAR tools.
- Fix recently introduced ld version check with 64-bit LE-only toolchain.
- Free the subpage_prot_table correctly, avoiding a memory leak.
Thanks to:
Aneesh Kumar K.V, Benjamin Herrenschmidt, Laurent Vivier.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZezIGAAoJEFHr6jzI4aWAyukP/3mxlQ3WdQlYPByZ18cj6YL5
L0kbRxDgAosD9HzcOPqku1um1l7D6Gk5KZFZfsol7SasSmCZEwV4MbdTrAiRxS6K
tF0V14hP/BDQeIKfxlnUepzfL8PY7CkO6sDAa6BjHXvBk4+POI+37uw9+2GEV8DY
tA45fHjA/Zq3eUXsK0WTHIcd09lJXXarf9Tlx+YNZ+3yJ1OMfOji3CXgTkjtwYM9
XTtsKzsagY1zLwr5gXJu1P05+OGna2VmY6+Tn2lnf7scTFW3qYGF3eWRx71diiKS
PpZCjqfzWF4+TDIGPoYIrkTE+ZKR0lyo6F38GYwae0cYZMs9pGPEpeNahd8Nun+v
MLU6TnhNfOI40GEYgmOMNKHPJJLSx59Qr/GnrAi/h2nUEocuN76jzNbaeFBtj3jD
/vrRTmVUtt1wGqORX7BK4YZFHqcHmZBCM7bQnxibJtLv7fMue0sk58fs2jAaZ1iD
NacpzsXG7CWYgj6ApclVCYuF99dXTpjrw/WPxilXDg84Pxb7Dv1SpIvLb2T5Guq+
iqqavViRHP1ng+5/giIsOvF9CnsCzbRYLb0zZTP91nckMmYI6wX2zc56lofjcI5j
Qc5o/aJvBk4vSM9sibBGEdrZJ1Vt16gGorQ5NZUurZund/cVqvQFhm/4Tvnc0cVN
yvLNZI8am35pI9CCJ2im
=Z6uF
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"The highlight is Ben's patch to work around a host killing bug when
running KVM guests with the Radix MMU on Power9. See the long change
log of that commit for more detail.
And then three fairly minor fixes:
- fix of_node_put() underflow during reconfig remove, using old DLPAR
tools.
- fix recently introduced ld version check with 64-bit LE-only
toolchain.
- free the subpage_prot_table correctly, avoiding a memory leak.
Thanks to: Aneesh Kumar K.V, Benjamin Herrenschmidt, Laurent Vivier"
* tag 'powerpc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/hash: Free the subpage_prot_table correctly
powerpc/Makefile: Fix ld version check with 64-bit LE-only toolchain
powerpc/pseries: Fix of_node_put() underflow during reconfig remove
powerpc/mm/radix: Workaround prefetch issue with KVM
- sunxi: Correct time phase settings
- omap_hsmmc: Clean up some dead code
- dw_mmc: Fix message printed for deprecated num-slots DT binding
- dw_mmc: Fix DT documentation
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZeu0JAAoJEP4mhCVzWIwpECAP/0kf5o3lEneUOqSGE9d1EszW
GpHgX0hA1+FaQpBUhGAfFCjmNigsH8rwz8dOv17PX8iFyQmal0ivgC7DXJPG2Rq1
ofC4MwfWzZE0AKRSLButvRyNHqmwZsSM8nlFPwAtsINktJCx/WhSr6OS5pNEdz/j
1tEGDgLzBiq9Yd3FHf07KPPkMhxut0eI1gXke8pRgFkLQIwU4/8zFb6450w0RIxQ
BtmqEEK0p3cyZLN/FxpyMG6ZVmypTUiMFX9G0xkcKdsTxGqnYpWvCFuEbEtx6vbU
5IjjKc2oINMs3z53tRiN/vQaSuZMn1O4dKHydADxP68Pm/ff09+pgvnpTV4D83RV
/gw9olO1Y//ONCT+p/k2fHhOlLUa4YY2+SUCN7VZAqP5gYEjtH9/doOoWND//WPA
BhFcZsWBoDva+M3OC2wNnZb5aCERVLuHPl3NhdiOpxyGoEEG1c0MvhegaUI4Rm0K
hoVyuXqWsbu+3A3H+biELb0VEIlgELkCIRh7mjKSq8oicPN7PtymhAgfGyxcCkn+
qcvlN0UxcJjYxYTXzKjEaTiCHeel0UB3toCWdfq+L5znHgRMjZoxYM/tUhF8+rts
wTcS/NkLz4DGJGQavfJvdawddafCbFnoL19KnWd2Wl3RxNOrQ2lnvfBODDuPrVoz
IJqzwddWO/w6gTDT5Xt6
=xWHL
-----END PGP SIGNATURE-----
Merge tag 'mmc-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"Here are a couple of mmc fixes intended for v4.13-rc1.
I have also included a couple of cleanup patches in this pull request
for OMAP2+, related to the omap_hsmmc driver. The reason is because of
the changes are also depending on OMAP SoC specific code, so this
simplifies how to deal with this.
Summary:
MMC host:
- sunxi: Correct time phase settings
- omap_hsmmc: Clean up some dead code
- dw_mmc: Fix message printed for deprecated num-slots DT binding
- dw_mmc: Fix DT documentation"
* tag 'mmc-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
Documentation: dw-mshc: deprecate num-slots
mmc: dw_mmc: fix the wrong condition check of getting num-slots from DT
mmc: host: omap_hsmmc: remove unused platform callbacks
ARM: OMAP2+: hsmmc.c: Remove dead code
mmc: sunxi: Keep default timing phase settings for new timing mode
Commit 8e3f1b1d82 ("powerpc/powernv/pci: Enable 64-bit devices to access
>4GB DMA space") introduced the ability for PCI device drivers to request a
DMA mask between 64 and 32 bits and actually get a mask greater than
32-bits. However currently if certain machine configuration dependent
conditions are not meet the code silently falls back to a 32-bit mask.
This makes it hard for device drivers to detect which mask they actually
got. Instead we should return an error when the request could not be
fulfilled which allows drivers to either fallback or implement other
workarounds as documented in DMA-API-HOWTO.txt.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Historically the boot wrapper was always built 32-bit big endian, even
for 64-bit kernels. That was because old firmwares didn't necessarily
support booting a 64-bit image. Because of that arch/powerpc/boot/Makefile
uses CROSS32CC for compilation.
However when we added 64-bit little endian support, we also added
support for building the boot wrapper 64-bit. However we kept using
CROSS32CC, because in most cases it is just CC and everything works.
However if the user doesn't specify CROSS32_COMPILE (which no one ever
does AFAIK), and CC is *not* biarch (32/64-bit capable), then CROSS32CC
becomes just "gcc". On native systems that is probably OK, but if we're
cross building it definitely isn't, leading to eg:
gcc ... -m64 -mlittle-endian -mabi=elfv2 ... arch/powerpc/boot/cpm-serial.c
gcc: error: unrecognized argument in option ‘-mabi=elfv2’
gcc: error: unrecognized command line option ‘-mlittle-endian’
make: *** [zImage] Error 2
To fix it, stop using CROSS32CC, because we may or may not be building
32-bit. Instead setup a BOOTCC, which defaults to CC, and only use
CROSS32_COMPILE if it's set and we're building for 32-bit.
Fixes: 147c05168f ("powerpc/boot: Add support for 64bit little endian wrapper")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
In smp_cpus_done() we need to call smp_ops->setup_cpu() for the boot
CPU, which means it has to run *on* the boot CPU.
In the past we ensured it ran on the boot CPU by changing the CPU
affinity mask of current directly. That was removed in commit
6d11b87d55 ("powerpc/smp: Replace open coded task affinity logic"),
and replaced with a work queue call.
Unfortunately using a work queue leads to a lockdep warning, now that
the CPU hotplug lock is a regular semaphore:
======================================================
WARNING: possible circular locking dependency detected
...
kworker/0:1/971 is trying to acquire lock:
(cpu_hotplug_lock.rw_sem){++++++}, at: [<c000000000100974>] apply_workqueue_attrs+0x34/0xa0
but task is already holding lock:
((&wfc.work)){+.+.+.}, at: [<c0000000000fdb2c>] process_one_work+0x25c/0x800
...
CPU0 CPU1
---- ----
lock((&wfc.work));
lock(cpu_hotplug_lock.rw_sem);
lock((&wfc.work));
lock(cpu_hotplug_lock.rw_sem);
Although the deadlock can't happen in practice, because
smp_cpus_done() only runs in early boot before CPU hotplug is allowed,
lockdep can't tell that.
Luckily in commit 8fb12156b8 ("init: Pin init task to the boot CPU,
initially") tglx changed the generic code to pin init to the boot CPU
to begin with. The unpinning of init from the boot CPU happens in
sched_init_smp(), which is called after smp_cpus_done().
So smp_cpus_done() is always called on the boot CPU, which means we
don't need the work queue call at all - and the lockdep warning goes
away.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
The vast majority of virtual allocations in the vmalloc region are followed
by a guard page, which can help to avoid overruning on vma into another,
which may map a read-sensitive device.
This patch adds a guard page to the end of the kernel image mapping (i.e.
following the data/bss segments).
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The clang warning 'address-of-packed-member' is disabled for the general
kernel code, also disable it for the x86 boot code.
This suppresses a bunch of warnings like this when building with clang:
./arch/x86/include/asm/processor.h:535:30: warning: taking address of
packed member 'sp0' of class or structure 'x86_hw_tss' may result in an
unaligned pointer value [-Waddress-of-packed-member]
return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
^~~~~~~~~~~~~~~~~~~
./arch/x86/include/asm/percpu.h:391:59: note: expanded from macro
'this_cpu_read_stable'
#define this_cpu_read_stable(var) percpu_stable_op("mov", var)
^~~
./arch/x86/include/asm/percpu.h:228:16: note: expanded from macro
'percpu_stable_op'
: "p" (&(var)));
^~~
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170725215053.135586-1-mka@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently flush_tmregs_to_thread() does not save the TM SPRs (TFHAR,
TFIAR, TEXASR) to the thread struct, unless the process is currently
inside a suspended transaction.
If the process is core dumping, and the TM SPRs have changed since the
last time the process was context switched, then we will save stale
values of the TM SPRs to the core dump.
Fix it by saving the live register state to the thread struct in that
case.
Fixes: 08e1c01d6a ("powerpc/ptrace: Enable support for TM SPR state")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The Radix MMU translation tree as defined in ISA v3.0 contains two
different types of entry, directories and leaves. Leaves are
identified by _PAGE_PTE being set.
The formats of the two entries are different, with the directory
entries containing no spare bits for use by software. In particular
the bit we use for _PAGE_DEVMAP is not reserved for software, and is
part of the NLB (Next Level Base) field, essentially the address of
the next level in the tree.
Note that the Linux pte_t is not == _PAGE_PTE. A huge page pmd
entry (or devmap!) is also a leaf and so has _PAGE_PTE set, even
though we use a pmd_t for it in Linux.
The fix is to ensure that the pmd/pte_devmap() confirm they are
looking at a leaf entry (_PAGE_PTE) as well as checking _PAGE_DEVMAP.
Fixes: ebd3119793 ("powerpc/mm: Add devmap support for ppc64")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Add a comment in the code and flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The hardware spinlock drivers now depend on HWSPINLOCK (instead of
selecting it), so we need to explicitly enable it after commit
35fc8a07d7 ("Make HWSPINLOCK a menuconfig to ease disabling")
Without HWSPINLOCK, various drivers are left with unsatisfied
dependencies and Qcom boards using shared memory based communication
to request regulators are failing to boot and mount rootfs.
Fix this by explicitly enabling HWSPINLOCK in defconfig.
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
We get a link error trying to access the w100fb_gpio_read/write
functions from the platform when the driver is a loadable module
or not built-in, so the platform already uses 'select' to hard-enable
the driver.
However, that fails if the framebuffer subsystem is disabled
altogether.
I've considered various ways to fix this properly, but they
all seem like too much work or too risky, so this simply
adds another 'select' to force the subsystem on as well.
Fixes: 82427de2c7 ("ARM: pxa: PXA_ESERIES depends on FB_W100.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
An empty macro definition can cause unexpected behavior, in
case of the ixp4xx ioport_unmap, we get two warnings:
drivers/net/wireless/marvell/libertas/if_cs.c: In function 'if_cs_release':
drivers/net/wireless/marvell/libertas/if_cs.c:826:3: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
ioport_unmap(card->iobase);
drivers/vfio/pci/vfio_pci_rdwr.c: In function 'vfio_pci_vga_rw':
drivers/vfio/pci/vfio_pci_rdwr.c:230:15: error: the omitted middle operand in ?: will always be 'true', suggest explicit middle operand [-Werror=parentheses]
is_ioport ? ioport_unmap(iomem) : iounmap(iomem);
This uses an inline function to define the macro in a safer way.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Krzysztof Halasa <khalasa@piap.pl>
Just like ARCH_MULTIPLATFORM, we want to use ARM_PATCH_PHYS_VIRT
when possible, but that fails for NOMMU or XIP_KERNEL configurations.
Using 'imply' instead of 'select' gets this right and only uses
the symbol when we don't have to hardcode the address anyway.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
This variable may be used by some devices that each have their
on Kconfig symbol, or by none of them, and that causes a build
warning:
arch/arm/mach-mmp/devices.c:241:12: error: 'usb_dma_mask' defined but not used [-Werror=unused-variable]
Marking it __maybe_unused avoids the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The omap_generic_init() and omap_hwmod_init_postsetup() functions are
used in the initialization for all OMAP2+ SoC types, but in the
extreme case that those are all disabled, we get a warning about
unused code:
arch/arm/mach-omap2/io.c:412:123: error: 'omap_hwmod_init_postsetup' defined but not used [-Werror=unused-function]
arch/arm/mach-omap2/board-generic.c:30:123: error: 'omap_generic_init' defined but not used [-Werror=unused-function]
This annotates both as __maybe_unused to shut up that warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
The osk_mistral_init() contains code that is only compiled when
CONFIG_PM is set, but it uses a variable that is declared outside
of the #ifdef:
arch/arm/mach-omap1/board-osk.c: In function 'osk_mistral_init':
arch/arm/mach-omap1/board-osk.c:513:7: warning: unused variable 'ret' [-Wunused-variable]
This removes the #ifdef around the user of the variable,
make it always used.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Tony Lindgren <tony@atomide.com>
Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi>
sirfsoc_init_late is called by each of the three individual
SoC definitions, but in a randconfig build, we can encounter
a situation where they are all disabled:
arch/arm/mach-prima2/common.c:18:123: warning: 'sirfsoc_init_late' defined but not used [-Wunused-function]
While that is not a useful configuration, the warning also
doesn't help, so this patch marks the function as __maybe_unused
to let the compiler know it is there intentionally.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
ixp4xx defines the arguments to its __indirect_writesb() and other
functions as pointers to fixed-size data. This is not necessarily
wrong, and it works most of the time, but it causes warnings in
at least one driver:
drivers/net/ethernet/smsc/smc91x.c: In function 'smc_rcv':
drivers/net/ethernet/smsc/smc91x.c:495:21: error: passing argument 2 of '__indirect_readsw' from incompatible pointer type [-Werror=incompatible-pointer-types]
SMC_PULL_DATA(lp, data, packet_len - 4);
All other definitions of the same functions pass void pointers,
so doing the same here avoids the warnings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Krzysztof Halasa <khalasa@piap.pl>
The modem pm handler in the ams-delta board uses regulator_enable()
but does not check for a successful return code:
board-ams-delta.c:521:3: error: ignoring return value of 'regulator_enable', declared with attribute warn_unused_result [-Werror=unused-result]
It is not easy to propagate that return code to the callers in
uart_configure_port/uart_suspend_port/uart_resume_port, unless
we change all UART drivers, and it is unclear what those would
do with the return code.
Instead, this patch uses a runtime warning to replace the
compiletime warning. I have checked that the regulator in question
is hardcoded to a fixed-voltage GPIO regulator, and that should
never fail to get enabled if I understand the code right.
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Link: https://patchwork.kernel.org/patch/8391981/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The RAM_SIZE macro in mach/hardware.h conflicts with macros of
the same name in multiple drivers, leading to annoying build warnings:
In file included from drivers/net/ethernet/cirrus/cs89x0.c:79:0:
drivers/net/ethernet/cirrus/cs89x0.h:324:0: error: "RAM_SIZE" redefined [-Werror]
#define RAM_SIZE 0x1000 /* The card has 4k bytes or RAM */
^
In file included from /git/arm-soc/arch/arm/mach-rpc/include/mach/io.h:16:0,
from /git/arm-soc/arch/arm/include/asm/io.h:194,
from /git/arm-soc/include/linux/scatterlist.h:8,
from /git/arm-soc/include/linux/dmaengine.h:24,
from /git/arm-soc/include/linux/netdevice.h:38,
from /git/arm-soc/drivers/net/ethernet/cirrus/cs89x0.c:54:
arch/arm/mach-rpc/include/mach/hardware.h:28:0: note: this is the location of the previous definition
#define RAM_SIZE 0x10000000
We don't use RAM_SIZE/RAM_START at all, so we could just remove
them, but it might be nice to leave them for documentation purposes,
so this renames them to RPC_RAM_SIZE/RPC_RAM_START in order to
avoid the build warnings
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
w90x900 still provides its own variant of the clk API rather than using
the generic COMMON_CLK API. This generally works, but it causes some link
errors with drivers using the clk_set_rate, clk_get_parent, clk_set_parent
or clk_round_rate functions when a platform lacks those interfaces.
This adds empty stub implementations for each of them, and I don't even
try to do something useful here but instead just print a WARN() message
to make it obvious what is going on if they ever end up being called.
The drivers that call these won't be used on these platforms (otherwise
we'd get a link error today), so the added code is harmless bloat and
will warn about accidental use.
A while ago there was a proposal to change w90x900 to use the common-clk
implementation, which would be the way it should be handled properly.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
It's a combination of the patch from Arnd Bergmann, which added empty stubs
for clk_round_rate() and clk_set_parent() and a working trivial
implementation of clk_get_parent(). The later is required for ADC driver.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull parisc fixes from Helge Deller:
- The majority of lines changed are due to regenerated defconfig files.
- The support for the Page Deallocation Table (PDT) which was merged in
the merge window for 4.13 contained a bug which crashes the kernel if
a bad page is reported by firmware. This is now fixed and the kernel
messages will show which memory slot holds the broken DIMM.
- Commit 3a166fc2d4 ("kbuild: handle libs-y archives separately from
built-in.o archives") broke linking the parisc kernel due to
millicode symbols which can't be reached then any longer. This was
fixed by modifying the parisc vmlinux.lds linker script.
- If the stack checker panics on stack overflow, avoid recursive
panics.
- Some parisc machines can't physically power off and thus instead
start after some time to flood the console by presumably detected
soft lockups. Avoid this by disabling the lockup detectors before
entering the endless for-next loop.
- Dave Anglin provided fixes which prevents TLB speculation on flushed
pages on PA8800/PA9000 CPUs.
- Arvind Yadav sent a trivial patch to constify the attribute_group
structure in our firmware on-board-flash storage driver
(pdc_stable.c)
* 'parisc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Extend disabled preemption in copy_user_page
parisc: Prevent TLB speculation on flushed pages on CPUs that only support equivalent aliases
parisc: Suspend lockup detectors before system halt
parisc: Show DIMM slot number which holds broken memory module
parisc: Add function to return DIMM slot of physical address
parisc: Fix crash when calling PDC_PAT_MEM PDT firmware function
parisc: regenerate defconfig files
parisc: pdc_stable: constify attribute_group structures.
parisc: Merge millicode routines via linker script
parisc: Disable further stack checks when panic occurs during stack check
Pull ARM fixes from Russell King:
"Two areas addressed by these fixes:
- Fixes from Dave Martin for the signal frames that were broken with
certain configurations. No one noticed until recently.
- More kexec fixes to ensure that the crashkernel region is correctly
allocated, and a fix for the location of the device tree when
several kexec kernels are loaded"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8687/1: signal: Fix unparseable iwmmxt_sigframe in uc_regspace[]
ARM: 8686/1: iwmmxt: Add missing __user annotations to sigframe accessors
ARM: kexec: fix failure to boot crash kernel
ARM: kexec: avoid allocating crashkernel region outside lowmem
Now that the CCU device tree binding headers have been merged, we can
use the properly named macros in the device tree, instead of raw
numbers.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The datasheet said that emac register size is 0x10000 not 0x100
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
[wens@csie.org: Fixed commit subject prefix]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
The datasheet said that emac register size is 0x10000 not 0x104
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
[wens@csie.org: Fixed commit subject prefix]
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Since the PMU register interface is banked per CPU, CPU PMU interrrupts
cannot be handled by a CPU other than the one with the PMU asserting the
interrupt. This means that migrating PMU SPIs, as we do during a CPU
hotplug operation doesn't make any sense and can lead to the IRQ being
disabled entirely if we route a spurious IRQ to the new affinity target.
This has been observed in practice on AMD Seattle, where CPUs on the
non-boot cluster appear to take a spurious PMU IRQ when coming online,
which is routed to CPU0 where it cannot be handled.
This patch passes IRQF_PERCPU for PMU SPIs and forcefully sets their
affinity prior to requesting them, ensuring that they cannot
be migrated during hotplug events. This interacts badly with the DB8500
erratum workaround that ping-pongs the interrupt affinity from the handler,
so we avoid passing IRQF_PERCPU in that case by allowing the IRQ flags
to be overridden in the platdata.
Fixes: 3cf7ee98b8 ("drivers/perf: arm_pmu: move irq request/free into probe")
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
sa1100 provides its own variant of the clk API rather than using the
generic COMMON_CLK API. This generally works, but it causes some link
errors with drivers using the clk_set_rate, clk_get_parent, clk_set_parent
or clk_round_rate functions when a platform lacks those interfaces.
This adds trivial stub implementations for each of them, based on
the behavior of the COMMON_CLK implementation:
- set_rate() and set_parent() report success without doing anything
- round_rate() returns the clk rate
- get_parent() returns NULL.
This adds the minimal bloat and should do the right thing for
the simple clock hardware in this SoC.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
davinci still has its own clk implementation, but lacks
a clk_get_parent() helper, which can lead to link errors
in randconfig builds.
This adds the usual implementation.
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
In commit 3169663ac5 "ARM: sa11x0/pxa: convert OS timer registers
to IOMEM", the definition of the OSCR macro was changed to be an
__iomem pointer, but the same register is also used by the XIP
code. This patch does the corresponding change here as well.
On PXA, the IRQ register definitions were removed even earlier, in
commit 5d284e353e ("ARM: pxa: avoid accessing interrupt registers
directly"). This patch unfortunately brings some of that back. An
earlier version of my patch moved the code into an external function,
which could not work for CONFIG_XIP_KERNEL+CONFIG_MTD_XIP, so this
restores something close to the original code.
Link: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241716.html
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
A change to the platform data definitions caused a warning in the board code:
arch/arm/mach-davinci/board-da850-evm.c:1221:13: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
arch/arm/mach-davinci/board-da850-evm.c:1231:13: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
This is a bit unfortunate, since we generally like structure definitions to
be const, but as this is legacy code, the easiest way out is still to
remove the 'const' annotation here.
Fixes: 4a5f8ae50b ("[media] davinci: vpif_capture: get subdevs from DT when available")
Fixes: 231ce279e6 ("ARM: davinci: fix const warnings")
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Drop the unused endpoints. They should only be used when there is an
actual remote-endpoint connected.
Cc: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Drop the unused endpoints. They should only be used when there is
an actual remote-endpoint connected.
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
- Fix wrong irq type for gpio expeander on Armada 388 GP
- Use __pa_symbol instead of virt_to_phys in the mv98dx3236 platform
SMP code
-----BEGIN PGP SIGNATURE-----
iIEEABECAEEWIQQYqXDMF3cvSLY+g9cLBhiOFHI71QUCWW3X4iMcZ3JlZ29yeS5j
bGVtZW50QGZyZWUtZWxlY3Ryb25zLmNvbQAKCRALBhiOFHI71e78AJ4+wNdgOn6C
8PVxz97Ngx2L0O44AgCfXWXbhHye3y4p2wfLfFTBT1prMZ4=
=gydP
-----END PGP SIGNATURE-----
Merge tag 'mvebu-fixes-4.13-1' of git://git.infradead.org/linux-mvebu into fixes
Pull "mvebu fixes for 4.13 (part 1)" from Gregory CLEMENT:
- Fix wrong irq type for gpio expeander on Armada 388 GP
- Use __pa_symbol instead of virt_to_phys in the mv98dx3236 platform
SMP code
* tag 'mvebu-fixes-4.13-1' of git://git.infradead.org/linux-mvebu:
ARM: dts: armada-38x: Fix irq type for pca955
ARM: mvebu: use __pa_symbol in the mv98dx3236 platform SMP code
Add necessary parent clocks for audss (Audio SubSystem, MAUDIO) clock
controller block.
This allows driver to keep EPLL enabled before accessing any MAUDIO
registers thus fixing silent hang. This silent hang appeared with
commit 6edfa11cb3 ("clk: samsung: Add enable/disable operation for
PLL36XX clocks"), e.g. on Odroid U3 usually with last (but unrelated)
messages:
[ 2.382741] input: gpio_keys as /devices/platform/gpio_keys/input/input0
[ 2.405686] usb 1-3: new high-speed USB device number 3 using exynos-ehci
[ 2.419843] max77686-rtc max77686-rtc: setting system clock to 2017-06-21 17:04:13 UTC (1498064653)
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Correct order of sound clock frequencies for ULCB boards
used by r8a7795 and r8a7796 SoCs.
These sounds clock frequencies are used as the ADG clock (output clocks
for audio module) initial setting and sound codec's initial system clock
which needs the maximum clock frequency. Thus, descending order is
required.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZaHmuAAoJENfPZGlqN0++XmQP/i7HumkEn4LPeyEQ/ePTwOHs
zz6CNekfmW0aJe4lceRmZ/xMkIuzrwma+EsQj3xj35cl1iK3MvjkguVYAFeYVhJb
w0X0qJp/0Opz0RNlKo4Pufj0yWy1ZyAQ9ci3UZELTik1sJI/7LG9eyjrXfaxRywo
OrYq3l41t9Y3/5e9MThUP4oxLO5B8YaJDtGGVYOoVk7cIs0PM9ffG8GVz0IPbrmF
wFEbVr/EohjpQL0SJ5GO2+NJhUbJFRuLOBWZOTyBQiZ3GByloRXzDR6gU/XSczHB
vElss4KZiWp2k7xRaBgaHK88u8wSOKdDkXrl6pto+aw1gAlFtOgWArKNQtkrp0vd
+VHk7Emhdnsw9SWTz0N+Oq7fyA2QIWo1sBr16AkmgB/JStlOXklJWjKgtyK0+mLj
FZAhOz5qO9ctr0WO87gbVvTz9xcs81g/K0D9rFTlmEIF8GEuDZxrX43LlIDl7I9r
nbfBj/HFtXqQXLIzS2OR9ah2F5DFWMym2/+mYuSkOt883PrbyjBq2xNdDD+QYcnO
+p6xCHAY/xiT953NBs/2z0kAXp6bWMjEOO7Lsv3SnPUss7Nng6RU2wbJepZd05Lo
ZH0k1L7IJtq2UVduHFYkEwKzTGCycW4mmposVNHZm42FOw19nR6Xd3khU57fMAni
ZH59p+Q/lFxcddwDZ0T1
=bjUb
-----END PGP SIGNATURE-----
Merge tag 'renesas-fixes2-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Second Round of Renesas ARM Based SoC Fixes for v4.13" from Simon Horman:
Correct order of sound clock frequencies for ULCB boards
used by r8a7795 and r8a7796 SoCs.
These sounds clock frequencies are used as the ADG clock (output clocks
for audio module) initial setting and sound codec's initial system clock
which needs the maximum clock frequency. Thus, descending order is
required.
* tag 'renesas-fixes2-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: renesas: ulcb: sound clock-frequency needs descending order
- Fix disable_irq related shared IRQ warnings for omap3 PRM
- Fix omap4 legacy code regression that accidentally removed code that
we still need for PRM interrupts
- Fix dm8168-evm NAND pins and MMC write protect pin direction
- Fix dra71-evm mdio impedance values
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAlld8MERHHRvbnlAYXRv
bWlkZS5jb20ACgkQG9Q+yVyrpXPnxRAAkGl1Q0HYGFl6+0YCLzk4jAD8Tx3T8kS6
Y+zaXkUowyQj548Sx55hSbCfxIyJETF2VBFWbMVMeNaDE1Gge+2rj6NdELAP1olu
cOIuRlJt4R6xBXRhJKMu4yA2DqslOD6C2rNM6B0qmNFoO+O4/9uhYZEAD03/RFRJ
8EXvr5lGjCfljEC/EgN7kfyPCOjU4aknbREZOUknQ/CMYm53G6FrNzppNNe0CVmj
59/Pp6lgYVPrBNXMpcD1LXu8fkOcp6vQ70h+WFLrdaf1GOZSj7SDHD50Lk2tNnN6
UWJ3aRE3cGnaNBwcnnrMkwMyIWBU6mEWSAOl/bQgf/jFiyFcgqDk+/i9POqGqzou
p3Jyn9of8Cg3SoIbImOsPODMgtesDmxy36nYcR+EH6x/mGDnwFzhxR74RB/X2ltq
RoeT8RbJxckz+h80Zjp6IbJQoJ6CcNqtPDiaf/GqsloSEYcUh51CBrP0teT51nFU
VZk9PRRsasIc7mpgyDg8jODZCtzvK0KTyTU3l6b6Oct7o8b/BOMAunKTAvujsT5Q
gqFCIJZ/iNTZ5GN8I6qEOTD4um5NFDl//tiQlVbU4/t/k2eUzlHErl0sc+ycv+N5
pRmz843Zs7U3lSithCF4/840BG7dQ0cTbx9binYtSCUOI9wKKORWz6ieKOf7mZi2
TR/Sqx+i0dI=
=ibLi
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.13/fixes-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Pull "Few fixes for omaps for issues found recently" from Tony Lindgren:
- Fix disable_irq related shared IRQ warnings for omap3 PRM
- Fix omap4 legacy code regression that accidentally removed code that
we still need for PRM interrupts
- Fix dm8168-evm NAND pins and MMC write protect pin direction
- Fix dra71-evm mdio impedance values
* tag 'omap-for-v4.13/fixes-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: dra71-evm: mdio: Fix impedance values
ARM: dts: dm816x: Correct the state of the write protect pin
ARM: dts: dm816x: Correct NAND support nodes
ARM: OMAP4: Fix legacy code clean-up regression
ARM: OMAP2+: Fix omap3 prm shared irq
Correct order of sound clock frequencies for Salvator boards
used by r8a7795 and r8a7796 SoCs.
These sounds clock frequencies are used as the ADG clock (output clocks
for audio module) initial setting and sound codec's initial system clock
which needs the maximum clock frequency. Thus, descending order is
required.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZVhaCAAoJENfPZGlqN0++/zkP/1aHxFA560O3cb4SebIeJKQd
TQFd7YocK1lsqD6DP1ZLQI7ytPfpyAKPctKLRe2NSwr+kf5N8rbYUfZ88pBXe7ez
y9mbFBYcKHvDLrGHtW5hebBGB6HAh2Ll9DMW6Hqdsv8qG1/EqlLwBkA10/VSjaYk
Xfp4XnbyXsWCxbllU04We0uel945rg6tQy6VBszW5QEwMEX4zhaOsmi6z8nCv11E
k8Q6xl2/+FWc/zfivJSzBn59XjQSo+E54yRkmaYcteTY2+8Y787QMvVKkbUAvnHi
DrVplSPdbQPeZInqDiw97fvUNDTIzYatskMF3Lv6HFZ+u8YHTrW6yq2CShWtecUx
7BP8GgwfkLHt8J5PADMpNejzfE5lz+7EPm8+2jhGeoHd0BNjojaFvbyL01pJntwp
nJrSKlLmRB1CDEiubVPUUrY+EWcx7wxK0V8eliWTAn/o1ijJTErbRF91Z/U9BACF
PQUToBZ8MJKbjbrFXmelswkTwORV7au0KwY6Jzm2Vue7TiCZU4BXwyxZ1hmH3dI1
GLOWo5mtlXWcUnHLrPv44x6povNQDCtPGowTyKkqo2vXIAlAkMaG4V5D2tV9ykr7
d0UDWA44fF7QYcgW8hEd1eY0ILnYuX7Zanh0xvb5OBjqLEwsf+FvLel6/eMzg4Pj
tNB+iosh5YumUfCIue0D
=t800
-----END PGP SIGNATURE-----
Merge tag 'renesas-fixes-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Renesas ARM Based SoC Fixes for v4.13" from Simon Horman:
Correct order of sound clock frequencies for Salvator boards
used by r8a7795 and r8a7796 SoCs.
These sounds clock frequencies are used as the ADG clock (output clocks
for audio module) initial setting and sound codec's initial system clock
which needs the maximum clock frequency. Thus, descending order is
required.
* tag 'renesas-fixes-for-v4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: renesas: salvator-common: sound clock-frequency needs descending order
Fixes: dad6f37c26 ("powerpc: subpage_protect: Increase the array size to take care of 64TB")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We need to hold the srcu lock when accessing memory slots
during migration
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJZd0O1AAoJEBF7vIC1phx8WOIQAKLLmnqmKfD5sy3hIxkRUHG7
WlN33M9tNCjDa1gL6iurBqKbOHt7phBhZdp/Kzuy/UxwPtM5vMaajQv1mLKl7b2y
1GKEt2aTMJ3e/ql3Z3kVOASNguU2NHjVuzu+BObLhfzBXmNYp4hkyrMSkegb4Q2v
j6Kfhpz/ltrzCYhL1/WYquKwG/aAV0fDefbfSrZZ0yx1TCnbtAEfbqBO6LXfys2t
M3LW2OQwavV0YYrrwF4o7Seg0YfpQVYVH1j1/0dLd/Mb1w1D5/CB+gTr6zYhZ6z2
MzZAa5ADRJ1y+YEoB6aB8T95/4i9Aw3fYhPYfw4ijUJcHUwSbzllSt2brIUSkEZo
rajClCRjAMaMfeg2q5SZ6Mbg/wt5uo9O0hRG9weqZQR+NCnYkcrfgTe2kYSrorAs
Msp6CUSfP9cOL9JETD+f3hEWl4uqA1P21n6Ga/v5ERPDxqIrqU49mlgjxBKRcPK4
wHKaBrA4h53yOjNQ5U/Bdr130W2dQnRHmYAQXuUM9OjbBOliTbtgf8ND0U15S9TY
HZomb3kEkL1o8J5rH6NBxxntoU48SEtuppBDSIAYBnp9h02F+RvNPqMr4orFHfD/
OJe0zJHYobK44kz59DGmN5ed5wZl+Vzvdf7KAa62MotKmDMykbkWbcb9j+gcUFHf
Yal+Mk51xCKPUcadGD/e
=PjH2
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-master-4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: fixup missing srcu lock
We need to hold the srcu lock when accessing memory slots
during migration
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1:
Before NMI IRET test
Sending NMI to self
NMI isr running stack 0x461000
Sending nested NMI to self
After nested NMI to self
Nested NMI isr running rip=40038e
After iret
After NMI to self
FAIL: NMI
Commit 4c4a6f790e (KVM: nVMX: track NMI blocking state separately
for each VMCS) tracks NMI blocking state separately for vmcs01 and
vmcs02. However it is not enough:
- The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault
on IRET, so the L2 can generate #PF which can be intercepted by L0.
- L0 walks L1's guest page table and sees the mapping is invalid, it
resumes the L1 guest and injects the #PF into L1. At this point the
vmcs02 has nmi_known_unmasked=true.
- L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field
of vmcs12 (and fixes the shadow page table) before resuming L2 guest.
- L1 executes VMRESUME to resume L2, causing a vmexit to L0
- during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the
interruptibility-state field of vmcs02, but nmi_known_unmasked is
still true.
- L2 immediately exits to L0 with another page fault, because L0 still has
not updated the NGVA->HPA page tables. However, nmi_known_unmasked is
true so vmx_recover_nmi_blocking does not do anything.
The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The PI vector for L0 and L1 must be different. If dest vcpu0
is in guest mode while vcpu1 is delivering a non-nested PI to
vcpu0, there wont't be any vmexit so that the non-nested interrupt
will be delayed.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are using the same vector for nested/non-nested posted
interrupts delivery, this may cause interrupts latency in
L1 since we can't kick the L2 vcpu out of vmx-nonroot mode.
This patch introduces a new vector which is only for nested
posted interrupts to solve the problems above.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts the change of commit f85c758dbe,
as the behavior it modified was intended.
The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual
says:
Table 4-7. Use of CR3 with PAE Paging
Bit Position(s) Contents
4:0 Ignored
31:5 Physical address of the 32-Byte aligned
page-directory-pointer table used for linear-address
translation
63:32 Ignored (these bits exist only on processors supporting
the Intel-64 architecture)
To placate the static checker, write the mask explicitly as an
unsigned long constant instead of using a 32-bit unsigned constant.
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: f85c758dbe
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
write_sysreg() may misparse the value argument because it is used
without parentheses to protect it.
This patch adds the ( ) in order to avoid any surprises.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
[will: same change to write_sysreg_s]
Signed-off-by: Will Deacon <will.deacon@arm.com>
In commit efe0160cfd ("powerpc/64: Linker on-demand sfpr functions
for modules"), we added an ld version check early in the powerpc
top-level Makefile.
Because the Makefile runs before the kernel config is setup, the
checks for CONFIG_CPU_LITTLE_ENDIAN etc. all take the default case. So
we end up configuring ld for 32-bit big endian.
That would be OK, except that for historical (or perhaps no) reason,
we use 'override LD' to add the endian flags to the LD variable
itself, rather than the normal approach of adding them to LDFLAGS.
The end result is that when we check the ld version we run it as:
$(CROSS_COMPILE)ld -EB -m elf32ppc --version
This often works, unless you are using a 64-bit only and/or little
endian only, toolchain. In which case you see something like:
$ make defconfig
powerpc64le-linux-ld: unrecognised emulation mode: elf32ppc
Supported emulations: elf64lppc elf32lppc elf32lppclinux elf32lppcsim
/bin/sh: 1: [: -ge: unexpected operator
The proper fix is to stop using 'override LD', but that will require a
fair bit of testing. Instead we can fix it for now just by reordering
the Makefile to do the version check earlier.
Fixes: efe0160cfd ("powerpc/64: Linker on-demand sfpr functions for modules")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
As for commit 68baf692c4 ("powerpc/pseries: Fix of_node_put()
underflow during DLPAR remove"), the call to of_node_put() must be
removed from pSeries_reconfig_remove_node().
dlpar_detach_node() and pSeries_reconfig_remove_node() both call
of_detach_node(), and thus the node should not be released in both
cases.
Fixes: 0829f6d1f6 ("of: device_node kobject lifecycle fixes")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There's a somewhat architectural issue with Radix MMU and KVM.
When coming out of a guest with AIL (Alternate Interrupt Location, ie,
MMU enabled), we start executing hypervisor code with the PID register
still containing whatever the guest has been using.
The problem is that the CPU can (and will) then start prefetching or
speculatively load from whatever host context has that same PID (if
any), thus bringing translations for that context into the TLB, which
Linux doesn't know about.
This can cause stale translations and subsequent crashes.
Fixing this in a way that is neither racy nor a huge performance
impact is difficult. We could just make the host invalidations always
use broadcast forms but that would hurt single threaded programs for
example.
We chose to fix it instead by partitioning the PID space between guest
and host. This is possible because today Linux only use 19 out of the
20 bits of PID space, so existing guests will work if we make the host
use the top half of the 20 bits space.
We additionally add support for a property to indicate to Linux the
size of the PID register which will be useful if we eventually have
processors with a larger PID space available.
There is still an issue with malicious guests purposefully setting the
PID register to a value in the hosts PID range. Hopefully future HW
can prevent that, but in the meantime, we handle it with a pair of
kludges:
- On the way out of a guest, before we clear the current VCPU in the
PACA, we check the PID and if it's outside of the permitted range
we flush the TLB for that PID.
- When context switching, if the mm is "new" on that CPU (the
corresponding bit was set for the first time in the mm cpumask), we
check if any sibling thread is in KVM (has a non-NULL VCPU pointer
in the PACA). If that is the case, we also flush the PID for that
CPU (core).
This second part is needed to handle the case where a process is
migrated (or starts a new pthread) on a sibling thread of the CPU
coming out of KVM, as there's a window where stale translations can
exist before we detect it and flush them out.
A future optimization could be added by keeping track of whether the
PID has ever been used and avoid doing that for completely fresh PIDs.
We could similarily mark PIDs that have been the subject of a global
invalidation as "fresh". But for now this will do.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop
unneeded include of kvm_book3s_asm.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- split the global dma coherent pool from the per-device pool.
This fixes a regression in the earlier 4.13 pull requests where the
global pool would override a per-device CMA pool. (Vladimir Murzin).
-----BEGIN PGP SIGNATURE-----
iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAll3l1sLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYN8BhAAqFxy2CrpEBk7gD2byOi9M4kTeXDYCESEoEAwuvTG
Fesbw5zumliBR2cjt/qk/uIDZ93fP4BuHn89NtIfcGOD1LqYOyIPwUTpmb9AgicD
y4eO1Gy/3DrG2haZcWYmDvq8yfSuR01H3ecY1KNsX1Y2kXxeBQfVKaUDR6fuix4+
uCf98LzIWs3TYmj7h48LVB/oNnigvs0oljrB2dWrWVJHbgGYEpmdPjBEe6r95e5U
5cHtPno5JA1lbBFt/nvsZl/NmzSd745SL3QwJsaVmSTf7oYnAuwyPI+5gqaoeQT6
24947e8hJjuLhBpO7RiqnJY9QdPxT0XKclkCcjnRb5j3dB9KL09f9Dz60exyJzSe
18V8+8+1m1BgvPsAOS/pLKYxKr9Kgzl9LFrFQaBkA5+7SPlywfV7HqaCkN/mKB4F
XJoQyRDLlZiDStDKbrhGEAHG6oYaZXnkpQ5xDitSXcSkh9/2a/elsG3caUBRI5qP
vKC0qvfBPjnHa/3lYNNoLgADB4tZCE3rRrVP6tqdHQbjuNUNK1wLNT7PiMfeoUVj
Oqql4le0AKlsxO4vRjavOrtaW1bVT+eAYLEtdQfXWQDvhffriEW6r6I8PGqIOiCO
OzxemCG2M6fcD9ho/VDpjo3Ei6tZylrxdTbrsm7ogQmo/U3ID9cfs452vIOYtCcB
9so=
=fJWP
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma mapping fixes from Christoph Hellwig:
"split the global dma coherent pool from the per-device pool.
This fixes a regression in the earlier 4.13 pull requests where the
global pool would override a per-device CMA pool (Vladimir Murzin)"
* tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mapping:
ARM: NOMMU: Wire-up default DMA interface
dma-coherent: introduce interface for default DMA pool
It's always bothered me that we only disable preemption in
copy_user_page around the call to flush_dcache_page_asm.
This patch extends this to after the copy.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Helge noticed that we flush the TLB page in flush_cache_page but not in
flush_cache_range or flush_cache_mm.
For a long time, we have had random segmentation faults building
packages on machines with PA8800/8900 processors. These machines only
support equivalent aliases. We don't see these faults on machines that
don't require strict coherency. So, it appears TLB speculation
sometimes leads to cache corruption on machines that require coherency.
This patch adds TLB flushes to flush_cache_range and flush_cache_mm when
coherency is required. We only flush the TLB in flush_cache_page when
coherency is required.
The patch also optimizes flush_cache_range. It turns out we always have
the right context to use flush_user_dcache_range_asm and
flush_user_icache_range_asm.
The patch has been tested for some time on rp3440, rp3410 and A500-44.
It's been boot tested on c8000. No random segmentation faults were
observed during testing.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
Some machines can't power off the machine, so disable the lockup detectors to
avoid this watchdog BUG to show up every few seconds:
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-shutdow:1]
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.9+
The Page Deallocation Table (PDT) holds the physical addresses of all broken
memory addresses. With the physical address we now are able to show which DIMM
slot (e.g. 1a, 3c) actually holds the broken memory module so that users are
able to replace it.
Signed-off-by: Helge Deller <deller@gmx.de>
Add a firmware wrapper function, which asks PDC firmware for the DIMM slot of a
physical address. This is needed to show users which DIMM module needs
replacement in case a broken DIMM was encountered.
Signed-off-by: Helge Deller <deller@gmx.de>
Commit c9c2877d08 ("parisc: Add Page Deallocation Table (PDT) support")
introduced the pdc_pat_mem_read_pd_pdt() firmware helper function, which
crashed the system because it trashed the stack if the
pdc_pat_mem_read_pd_retinfo struct was located on the stack (and which is
in size less than the required 32 64-bit values).
Fix it by using the pdc_result struct instead when calling firmware and copy
the return values back into the result struct when finished sucessfully.
While debugging this code I noticed that the pdc_type wasn't set correctly
either, so let's fix that too.
Fixes: c9c2877d08 ("parisc: Add Page Deallocation Table (PDT) support")
Signed-off-by: Helge Deller <deller@gmx.de>
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: set change and reference bit on lazy key enablement
s390: chp: handle CRW_ERC_INIT for channel-path status change
s390/perf: fix problem state detection
kvm_pmu_overflow_set() is called from perf's interrupt handler,
making the call of kvm_vgic_inject_irq() from it introduced with
"KVM: arm/arm64: PMU: remove request-less vcpu kick" a really bad
idea, as it's quite easy to try and retake a lock that the
interrupted context is already holding. The fix is to use a vcpu
kick, leaving the interrupt injection to kvm_pmu_sync_hwstate(),
like it was doing before the refactoring. We don't just revert,
though, because before the kick was request-less, leaving the vcpu
exposed to the request-less vcpu kick race, and also because the
kick was used unnecessarily from register access handlers.
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
When EFI runtime services are disabled, for example by the "noefi"
kernel cmdline parameter, the reboot_type could still be set to
BOOT_EFI causing reboot to fail.
Fix this by checking if EFI runtime services are enabled.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170724122248.24006-1-sassmann@kpanic.de
[ Fixed 'not disabled' double negation. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
undef memcpy() and friends in boot/string.c so that the functions
defined here will have the correct names, otherwise we end up
up trying to redefine __builtin_memcpy() etc.
Surprisingly, GCC allows this (and, helpfully, discards the
__builtin_ prefix from the function name when compiling it),
but clang does not.
Adding these #undef's appears to preserve what I assume was
the original intent of the code.
Signed-off-by: Michael Davidson <md@google.com>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bernhard.Rosenkranzer@linaro.org
Cc: Greg Hackmann <ghackmann@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170724235155.79255-1-mka@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The optional prefetch instructions in the copy_page() routine are
inconsistent: at the start of the function, two cachelines are
prefetched beyond the one being loaded in the first iteration, but
in the loop, the prefetch is one more line ahead. This appears to
be unintentional, so let's fix it.
While at it, fix the comment style and white space.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the
signal frame can be left partially uninitialised in such a way
that userspace cannot parse uc_regspace[] safely. In particular,
this means that the VFP registers cannot be located reliably in the
signal frame when a multi_v7_defconfig kernel is run on the
majority of platforms.
The cause is that the uc_regspace[] is laid out statically based on
the kernel config, but the decision of whether to save/restore the
iWMMXt registers must be a runtime decision.
To minimise breakage of software that may assume a fixed layout,
this patch emits a dummy block of the same size as iwmmxt_sigframe,
for non-iWMMXt threads. However, the magic and size of this block
are now filled in to help parsers skip over it. A new DUMMY_MAGIC
is defined for this purpose.
It is probably legitimate (if non-portable) for userspace to
manufacture its own sigframe for sigreturn, and there is no obvious
reason why userspace should be required to insert a DUMMY_MAGIC
block when running on non-iWMMXt hardware, when omitting it has
worked just fine forever in other configurations. So in this case,
sigreturn does not require this block to be present.
Reported-by: Edmund Grimley-Evans <Edmund.Grimley-Evans@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
preserve_iwmmxt_context() and restore_iwmmxt_context() lack __user
accessors on their arguments pointing to the user signal frame.
There does not be appear to be a bug here, but this omission is
inconsistent with the crunch and vfp sigframe access functions.
This patch adds the annotations, for consistency.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The following commit:
003002e04e ("kprobes: Fix arch_prepare_kprobe to handle copy insn failures")
returns an error if the copying of the instruction, but does not release
the allocated insn_slot.
Clean up correctly.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 003002e04e ("kprobes: Fix arch_prepare_kprobe to handle copy insn failures")
Link: http://lkml.kernel.org/r/150064834183.6172.11694375818447664416.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds two missing event extra regs for Skylake Server CHA PMU:
- TOR_INSERTS
- TOR_OCCUPANCY
Were missing support for all the filters, including opcode matchers.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1499967350-10385-6-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is no field c6 and link for CHA BOX FILTER.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1499967350-10385-5-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
PCU event format for SKX are different from snbep. Introduce a new
format group for SKX PCU.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1499967350-10385-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch fixes the event_mask and event_ext_mask for the Intel Skylake
Server UPI PMU. Bit 21 is not used as a filter. The extended umask is
from bit 32 to bit 55. Correct both umasks.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1499967350-10385-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit f98a8bf9ee ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB
ioctl() to change HPT size", 2016-12-20) changed the behaviour of
the KVM_PPC_ALLOCATE_HTAB ioctl so that it now allocates a new HPT
and new revmap array if there was a previously-allocated HPT of a
different size from the size being requested. In this case, we need
to reset the rmap arrays of the memslots, because the rmap arrays
will contain references to HPTEs which are no longer valid. Worse,
these references are also references to slots in the new revmap
array (which parallels the HPT), and the new revmap array contains
random contents, since it doesn't get zeroed on allocation.
The effect of having these stale references to slots in the revmap
array that contain random contents is that subsequent calls to
functions such as kvmppc_add_revmap_chain will crash because they
will interpret the non-zero contents of the revmap array as HPTE
indexes and thus index outside of the revmap array. This leads to
host crashes such as the following.
[ 7072.862122] Unable to handle kernel paging request for data at address 0xd000000c250c00f8
[ 7072.862218] Faulting instruction address: 0xc0000000000e1c78
[ 7072.862233] Oops: Kernel access of bad area, sig: 11 [#1]
[ 7072.862286] SMP NR_CPUS=1024
[ 7072.862286] NUMA
[ 7072.862325] PowerNV
[ 7072.862378] Modules linked in: kvm_hv vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iw_cxgb3 mlx5_ib ib_core ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel i2c_opal nfsd auth_rpcgss oid_registry
[ 7072.863085] nfs_acl lockd grace sunrpc kvm_pr kvm xfs libcrc32c scsi_dh_alua dm_service_time radeon lpfc nvme_fc nvme_fabrics nvme_core scsi_transport_fc i2c_algo_bit tg3 drm_kms_helper ptp pps_core syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm dm_multipath i2c_core cxgb3 mlx5_core mdio [last unloaded: kvm_hv]
[ 7072.863381] CPU: 72 PID: 56929 Comm: qemu-system-ppc Not tainted 4.12.0-kvm+ #59
[ 7072.863457] task: c000000fe29e7600 task.stack: c000001e3ffec000
[ 7072.863520] NIP: c0000000000e1c78 LR: c0000000000e2e3c CTR: c0000000000e25f0
[ 7072.863596] REGS: c000001e3ffef560 TRAP: 0300 Not tainted (4.12.0-kvm+)
[ 7072.863658] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE,TM[E]>
[ 7072.863667] CR: 44082882 XER: 20000000
[ 7072.863767] CFAR: c0000000000e2e38 DAR: d000000c250c00f8 DSISR: 42000000 SOFTE: 1
GPR00: c0000000000e2e3c c000001e3ffef7e0 c000000001407d00 d000000c250c00f0
GPR04: d00000006509fb70 d00000000b3d2048 0000000003ffdfb7 0000000000000000
GPR08: 00000001007fdfb7 00000000c000000f d0000000250c0000 000000000070f7bf
GPR12: 0000000000000008 c00000000fdad000 0000000010879478 00000000105a0d78
GPR16: 00007ffaf4080000 0000000000001190 0000000000000000 0000000000010000
GPR20: 4001ffffff000415 d00000006509fb70 0000000004091190 0000000ee1881190
GPR24: 0000000003ffdfb7 0000000003ffdfb7 00000000007fdfb7 c000000f5c958000
GPR28: d00000002d09fb70 0000000003ffdfb7 d00000006509fb70 d00000000b3d2048
[ 7072.864439] NIP [c0000000000e1c78] kvmppc_add_revmap_chain+0x88/0x130
[ 7072.864503] LR [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
[ 7072.864566] Call Trace:
[ 7072.864594] [c000001e3ffef7e0] [c000001e3ffef830] 0xc000001e3ffef830 (unreliable)
[ 7072.864671] [c000001e3ffef830] [c0000000000e2e3c] kvmppc_do_h_enter+0x84c/0x9e0
[ 7072.864751] [c000001e3ffef920] [d00000000b38d878] kvmppc_map_vrma+0x168/0x200 [kvm_hv]
[ 7072.864831] [c000001e3ffef9e0] [d00000000b38a684] kvmppc_vcpu_run_hv+0x1284/0x1300 [kvm_hv]
[ 7072.864914] [c000001e3ffefb30] [d00000000f465664] kvmppc_vcpu_run+0x44/0x60 [kvm]
[ 7072.865008] [c000001e3ffefb60] [d00000000f461864] kvm_arch_vcpu_ioctl_run+0x114/0x290 [kvm]
[ 7072.865152] [c000001e3ffefbe0] [d00000000f453c98] kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
[ 7072.865292] [c000001e3ffefd40] [c000000000389328] do_vfs_ioctl+0xd8/0x8c0
[ 7072.865410] [c000001e3ffefde0] [c000000000389be4] SyS_ioctl+0xd4/0x130
[ 7072.865526] [c000001e3ffefe30] [c00000000000b760] system_call+0x58/0x6c
[ 7072.865644] Instruction dump:
[ 7072.865715] e95b2110 793a0020 7b4926e4 7f8a4a14 409e0098 807c000c 786326e4 7c6a1a14
[ 7072.865857] 935e0008 7bbd0020 813c000c 913e000c <93a30008> 93bc000c 48000038 60000000
[ 7072.866001] ---[ end trace 627b6e4bf8080edc ]---
Note that to trigger this, it is necessary to use a recent upstream
QEMU (or other userspace that resizes the HPT at CAS time), specify
a maximum memory size substantially larger than the current memory
size, and boot a guest kernel that does not support HPT resizing.
This fixes the problem by resetting the rmap arrays when the old HPT
is freed.
Fixes: f98a8bf9ee ("KVM: PPC: Book3S HV: Allow KVM_PPC_ALLOCATE_HTAB ioctl() to change HPT size")
Cc: stable@vger.kernel.org # v4.11+
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
When compiling the 4.13-rc kernel I got those linker errors:
libgcc2.c:(.text+0x110): relocation truncated to fit: R_PARISC_PCREL22F against symbol `$$divU'
defined in .text.div section in /usr/lib/gcc/hppa64-linux-gnu/4.9.2/libgcc.a(_divU.o)
hppa64-linux-gnu-ld: /usr/lib/gcc/hppa64-linux-gnu/4.9.2/libgcc.a(_moddi3.o)(.text+0x174): cannot reach $$divU
Avoid such errors by bundling the millicode routines in the linker script.
Signed-off-by: Helge Deller <deller@gmx.de>
Before the irq handler detects a low stack and then panics the kernel, disable
further stack checks to avoid recursive panics.
Reported-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Commit dc6416f1d7 ("xen/x86: Call
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE) from xen_play_dead()")
introduced an error leading to a stack overflow of the idle task when
a cpu was brought offline/online many times: by calling
cpu_startup_entry() instead of returning at the end of xen_play_dead()
do_idle() would be entered again and again.
Don't use cpu_startup_entry(), but cpuhp_online_idle() instead allowing
to return from xen_play_dead().
Cc: <stable@vger.kernel.org> # 4.12
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
CONFIG_BOOTPARAM_HOTPLUG_CPU0 allows to offline CPU0 but Xen HVM guests
BUG() in xen_teardown_timer(). Remove the BUG_ON(), this is probably a
leftover from ancient times when CPU0 hotplug was impossible, it works
just fine for HVM.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Here are some small tty and serial driver fixes for 4.13-rc2. Nothing
huge at all, a revert of a patch that turned out to break things, a fix
up for a new tty ioctl we added in 4.13-rc1 to get the uapi definition
correct, and a few minor serial driver fixes for reported issues.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWXMebA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yn9MgCdGsIEQlTvTeG4QikxUPB7dkEJQacAoLWOKJzQ
A65mBAeOfiJztczi8uo4
=KEXp
-----END PGP SIGNATURE-----
Merge tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small tty and serial driver fixes for 4.13-rc2. Nothing
huge at all, a revert of a patch that turned out to break things, a
fix up for a new tty ioctl we added in 4.13-rc1 to get the uapi
definition correct, and a few minor serial driver fixes for reported
issues.
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: Fix TIOCGPTPEER ioctl definition
tty: hide unused pty_get_peer function
tty: serial: lpuart: Fix the logic for detecting the 32-bit type UART
serial: imx: Prevent TX buffer PIO write when a DMA has been started
Revert "serial: imx-serial - move DMA buffer configuration to DT"
serial: sh-sci: Uninitialized variables in sysfs files
serial: st-asc: Potential error pointer dereference
A bunch of small fixes for x86.
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJZcjkKAAoJEED/6hsPKofoo10H/3G0pYtaeKuEtD31hykSMyww
LaZG8+361eY3FD0X5SXJqWMuQXYXGOlbWcOSnArTRgdMIOaeHC50onJAD9sIX7T9
AywGO2RST3Gt83UfvCco47S8gJYs+gHzf5RaFTyvlSmJvDjPGmehjwyVwDWFcVkq
Pdry5jeQ2HvUgvJzphBb/UZHxb82v4haanReHzwRyvexdpApOp5WumgMPFBKMEc4
dVyJ0icgt9o+blSeNRInYezSwi98p0MHPP5xsD1gROXJxb7Hpu7iO3u1r3Z2dKaL
lil0999WCfCMl110ro57L7eC1izfa2B/klhpwo0Q/BJ+tniMfUQWJ54frXRw8XA=
=RzFO
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"A bunch of small fixes for x86"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: hyperv: avoid livelock in oneshot SynIC timers
KVM: VMX: Fix invalid guest state detection after task-switch emulation
x86: add MULTIUSER dependency for KVM
KVM: nVMX: Disallow VM-entry in MOV-SS shadow
KVM: nVMX: track NMI blocking state separately for each VMCS
KVM: x86: masking out upper bits
A handful of fixes, mostly for new code.
Some reworking of the new STRICT_KERNEL_RWX support to make sure we also remove
executable permission from __init memory before it's freed.
A fix to some recent optimisations to the hypercall entry where we were
clobbering r12, this was breaking nested guests (PR KVM).
A fix for the recent patch to opal_configure_cores(). This could break booting
on bare metal Power8 boxes if the kernel was built without
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG.
And finally a workaround for spurious PMU interrupts on Power9 DD2.
Thanks to:
Nicholas Piggin, Anton Blanchard, Balbir Singh.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZcctRAAoJEFHr6jzI4aWAji8P/RaG3/vGmhP4HGsAh7jHx/3v
EIBsjXPozQ16AIs4235JcQ2Vg54LLms6IaFfw1w9oYZxi98sNC3b56Rtd3AGxLbq
WHEGAD+mE9aXUOXkycLRkKZ+THiNzR0ZQKiMDqU+oJ9XbIAoof2gI4nc7UzUxGnt
EqxERiL2RrYqRBvO3W2ErtL29yXhopUhv+HKdbuMlIcoH4uooSdn435MPleIJWxl
WRFFv17DFw7PN6juwlff2YWvbTgrM1ND8czLfINzSZQDy0F+HEbKDFbtHlgAvOFN
GjbODX9OuU/QRrPQv6MoY2UumZNtLSCUsw5BUg0y4Qaak8p0gAEb4D/gmvIZNp/r
9ZqJDPfFKh/oA08apLTPJBKQmDUwCaYHyHVJNw10CU9GZ9CzW/2/B2/h2Egpgqao
vt0TgR3FXYizhXGAEpmBZzadAg9VKZTXH/irN6X62wMxNfvRIDOS57829QFHr0ka
wbsOK1gI1rE1g1GwsZafyURRSe95Sv84RDHW1fFQAtvFgpA3eHl26LUkpcvwOOFI
53g9zMedYOyZXXIdBn05QojwxgXZ0ry1Wc93oDHOn3rgDL6XBeOXFfFI5jIwPJDL
VZhOSvN1v+Ti3Q+I05zH+ll6BMhlU8RKBf+esq419DiLUeaNLHZR6GF0+UVJyJo+
WcLsJ1x/GDa9pOBfMEJy
=Oy4/
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A handful of fixes, mostly for new code:
- some reworking of the new STRICT_KERNEL_RWX support to make sure we
also remove executable permission from __init memory before it's
freed.
- a fix to some recent optimisations to the hypercall entry where we
were clobbering r12, this was breaking nested guests (PR KVM).
- a fix for the recent patch to opal_configure_cores(). This could
break booting on bare metal Power8 boxes if the kernel was built
without CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG.
- .. and finally a workaround for spurious PMU interrupts on Power9
DD2.
Thanks to: Nicholas Piggin, Anton Blanchard, Balbir Singh"
* tag 'powerpc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y
powerpc/mm/hash: Refactor hash__mark_rodata_ro()
powerpc/mm/radix: Refactor radix__mark_rodata_ro()
powerpc/64s: Fix hypercall entry clobbering r12 input
powerpc/perf: Avoid spurious PMU interrupts after idle
powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()
Pull x86 fixes from Ingo Molnar:
"Half of the fixes are for various build time warnings triggered by
randconfig builds. Most (but not all...) were harmless.
There's also:
- ACPI boundary condition fixes
- UV platform fixes
- defconfig updates
- an AMD K6 CPU init fix
- a %pOF printk format related preparatory change
- .. and a warning fix related to the tlb/PCID changes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/devicetree: Convert to using %pOF instead of ->full_name
x86/platform/uv/BAU: Disable BAU on single hub configurations
x86/platform/intel-mid: Fix a format string overflow warning
x86/platform: Add PCI dependency for PUNIT_ATOM_DEBUG
x86/build: Silence the build with "make -s"
x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl
x86/fpu/math-emu: Avoid bogus -Wint-in-bool-context warning
x86/fpu/math-emu: Fix possible uninitialized variable use
perf/x86: Shut up false-positive -Wmaybe-uninitialized warning
x86/defconfig: Remove stale, old Kconfig options
x86/ioapic: Pass the correct data to unmask_ioapic_irq()
x86/acpi: Prevent out of bound access caused by broken ACPI tables
x86/mm, KVM: Fix warning when !CONFIG_PREEMPT_COUNT
x86/platform/uv/BAU: Fix congested_response_us not taking effect
x86/cpu: Use indirect call to measure performance in init_amd_k6()
Pull perf fixes from Ingo Molnar:
"Two hw-enablement patches, two race fixes, three fixes for regressions
of semantics, plus a number of tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Add proper condition to run sched_task callbacks
perf/core: Fix locking for children siblings group read
perf/core: Fix scheduling regression of pinned groups
perf/x86/intel: Fix debug_store reset field for freq events
perf/x86/intel: Add Goldmont Plus CPU PMU support
perf/x86/intel: Enable C-state residency events for Apollo Lake
perf symbols: Accept zero as the kernel base address
Revert "perf/core: Drop kernel samples even though :u is specified"
perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target
perf evsel: State in the default event name if attr.exclude_kernel is set
perf evsel: Fix attr.exclude_kernel setting for default cycles:p
Pull core fixes from Ingo Molnar:
"A fix to WARN_ON_ONCE() done by modules, plus a MAINTAINERS update"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debug: Fix WARN_ON_ONCE() for modules
MAINTAINERS: Update the PTRACE entry
xtensa's asm/device.h is a verbatim copy of asm-generic/device.h and
does not add any arch specific extensions. Thus, use the asm-generic
header directly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
xtensa's asm/device.h is a verbatim copy of asm-generic/device.h and
does not add any arch specific extensions. Thus, use the asm-generic
header directly.
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each device node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: devicetree@vger.kernel.org
Link: http://lkml.kernel.org/r/20170718214339.7774-7-robh@kernel.org
[ Clarify the error message while at it, as 'node' is ambiguous. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have 2 functions using the same sched_task callback:
- PEBS drain for free running counters
- LBR save/store
Both of them are called from intel_pmu_sched_task() and
either of them can be unwillingly triggered when the
other one is configured to run.
Let's say there's PEBS drain configured in sched_task
callback for the event, but in the callback itself
(intel_pmu_sched_task()) we will also run the code for
LBR save/restore, which we did not ask for, but the
code in intel_pmu_sched_task() does not check for that.
This can lead to extra cycles in some perf monitoring,
like when we monitor PEBS event without LBR data.
# perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000
(We need PEBS, non freq/non timestamp event to enable
the sched_task callback)
The perf stat of cycles and msr:write_msr for above
command before the change:
...
Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
./perf bench sched pipe -l 1000000' (5 runs):
18,519,557,441 cycles:k
91,195,527 msr:write_msr
29.334476406 seconds time elapsed
And after the change:
...
Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
./perf bench sched pipe -l 1000000' (5 runs):
18,704,973,540 cycles:k
27,184,720 msr:write_msr
16.977875900 seconds time elapsed
There's no affect on cycles:k because the sched_task happens
with events switched off, however the msr:write_msr tracepoint
counter together with almost 50% of time speedup show the
improvement.
Monitoring LBR event and having extra PEBS drain processing
in sched_task callback showed just a little speedup, because
the drain function does not do much extra work in case there
is no PEBS data.
Adding conditions to recognize the configured work that needs
to be done in the x86_pmu's sched_task callback.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170719075247.GA27506@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The BAU confers no benefit to a UV system running with only one hub/socket.
Permanently disable the BAU driver if there are less than two hubs online
to avoid BAU overhead. We have observed failed boots on single-socket UV4
systems caused by BAU that are avoided with this patch.
Also, while at it, consolidate initialization error blocks and fix a
memory leak.
Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Russ Anderson <rja@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.ernst@hpe.com
Link: http://lkml.kernel.org/r/1500588351-78016-1-git-send-email-abanman@hpe.com
[ Minor cleanups. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The libretech CC derives less from the p212 than initially thought.
Several voltage regulators are different and the capabilities of the
sdcard and emmc also differ.
Deriving from the p212 is not convient anymore so the libretech is now
derived from s905x definition directly.
Fixes: cd84aff1d9 ("ARM64: dts: meson-gxl: Add Libre Technology CC support")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Use the specific compatible for AO pwms so the pwms input can
be correctly set
FDIV4 is not present on the pwm A0, so change kadhas vim input
clocks to xtal.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Remove old, dead Kconfig options (in order appearing in this commit):
- EXPERIMENTAL is gone since v3.9;
- INET_LRO: commit 7bbf3cae65 ("ipv4: Remove inet_lro library");
- AUTOFS_FS: commit 561c5cf923 ("staging: Remove autofs3");
- RCU_CPU_STALL_DETECTOR: commit a00e0d714f ("rcu: Remove conditional
compilation for RCU CPU stall warnings");
- USB_DEVICE_CLASS: commit 007bab9132 ("USB: remove
CONFIG_USB_DEVICE_CLASS");
- SYSCTL_SYSCALL_CHECK: commit 7c60c48f58 ("sysctl: Improve the
sysctl sanity checks");
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add hstate for each supported hugepage size using
arch initcall. This change fixes some hugepage
parameter parsing inconsistencies:
case 1: no hugepage parameters
Without hugepage parameters, only a hugepages-8192kB entry is visible
in sysfs. It's different from x86_64 where both 2M and 1G hugepage
sizes are available.
case 2: default_hugepagesz=[64K|256M|2G]
When specifying only a default_hugepagesz parameter, the default
hugepage size isn't really changed and it stays at 8M. This is again
different from x86_64.
Orabug: 25869946
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
They really are, and the "take the address of a single character" makes
the string fortification code unhappy (it believes that you can now only
acccess one byte, rather than a byte range, and then raises errors for
the memory copies going on in there).
We could now remove a few 'addressof' operators (since arrays naturally
degrade to pointers), but this is the minimal patch that just changes
the C prototypes of those template arrays (the templates themselves are
defined in inline asm).
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Acked-and-tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the SynIC timer message delivery fails due to SINT message slot being
busy, there's no point to attempt starting the timer again until we're
notified of the slot being released by the guest (via EOM or EOI).
Even worse, when a oneshot timer fails to deliver its message, its
re-arming with an expiration time in the past leads to immediate retry
of the delivery, and so on, without ever letting the guest vcpu to run
and release the slot, which results in a livelock.
To avoid that, only start the timer when there's no timer message
pending delivery. When there is, meaning the slot is busy, the
processing will be restarted upon notification from the guest that the
slot is released.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This can be reproduced by EPT=1, unrestricted_guest=N, emulate_invalid_state=Y
or EPT=0, the trace of kvm-unit-tests/taskswitch2.flat is like below, it tries
to emulate invalid guest state task-switch:
kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0
kvm_emulate_insn: 42000:0:0f 0b (0x2)
kvm_emulate_insn: 42000:0:0f 0b (0x2) failed
kvm_inj_exception: #UD (0x0)
kvm_entry: vcpu 0
kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0
kvm_emulate_insn: 42000:0:0f 0b (0x2)
kvm_emulate_insn: 42000:0:0f 0b (0x2) failed
kvm_inj_exception: #UD (0x0)
......................
It appears that the task-switch emulation updates rflags (and vm86
flag) only after the segments are loaded, causing vmx->emulation_required
to be set, when in fact invalid guest state emulation is not needed.
This patch fixes it by updating vmx->emulation_required after the
rflags (and vm86 flag) is updated in task-switch emulation.
Thanks Radim for moving the update to vmx__set_flags and adding Paolo's
suggestion for the check.
Suggested-by: Nadav Amit <nadav.amit@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The way how default DMA pool is exposed has changed and now we need to
use dedicated interface to work with it. This patch makes alloc/release
operations to use such interface. Since, default DMA pool is not
handled by generic code anymore we have to implement our own mmap
operation.
Tested-by: Andras Szemzo <sza@esh.hu>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Christoph noticed [1] that default DMA pool in current form overload
the DMA coherent infrastructure. In reply, Robin suggested [2] to
split the per-device vs. global pool interfaces, so allocation/release
from default DMA pool is driven by dma ops implementation.
This patch implements Robin's idea and provide interface to
allocate/release/mmap the default (aka global) DMA pool.
To make it clear that existing *_from_coherent routines work on
per-device pool rename them to *_from_dev_coherent.
[1] https://lkml.org/lkml/2017/7/7/370
[2] https://lkml.org/lkml/2017/7/7/431
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When kexec was converted to DTB, the dtb address was passed between
machine_kexec_prepare() and machine_kexec() using a static variable.
This is bad news if you load a crash kernel followed by a normal
kernel or vice versa - the last loaded kernel overwrites the dtb
address.
This can result in kexec failures, as (eg) we try to boot the crash
kernel with the last loaded dtb. For example, with:
the crash kernel fails to find the dtb.
Avoid this by defining a kimage architecture structure, and store
the address to be passed in r2 there, which will either be the ATAGs
or the dtb blob.
Fixes: 4cabd1d962 ("ARM: 7539/1: kexec: scan for dtb magic in segments")
Fixes: 42d720d173 ("ARM: kexec: Make .text R/W in machine_kexec")
Reported-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Allocating the crashkernel region outside lowmem causes the kernel to
oops while trying to kexec into the new kernel:
Loading crashdump kernel...
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = edd70000
[00000000] *pgd=de19e835
Internal error: Oops: 817 [#2] SMP ARM
Modules linked in: ...
CPU: 0 PID: 689 Comm: sh Not tainted 4.12.0-rc3-next-20170601-04015-gc3a5a20
Hardware name: Generic DRA74X (Flattened Device Tree)
task: edb32f00 task.stack: edf18000
PC is at memcpy+0x50/0x330
LR is at 0xe3c34001
pc : [<c04baf30>] lr : [<e3c34001>] psr: 800c0193
sp : edf19c2c ip : 0a000001 fp : c0553170
r10: c055316e r9 : 00000001 r8 : e3130001
r7 : e4903004 r6 : 0a000014 r5 : e3500000 r4 : e59f106c
r3 : e59f0074 r2 : ffffffe8 r1 : c010fb88 r0 : 00000000
Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: add7006a DAC: 00000051
Process sh (pid: 689, stack limit = 0xedf18218)
Stack: (0xedf19c2c to 0xedf1a000)
...
[<c04baf30>] (memcpy) from [<c010fae0>] (machine_kexec+0xa8/0x12c)
[<c010fae0>] (machine_kexec) from [<c01e4104>] (__crash_kexec+0x5c/0x98)
[<c01e4104>] (__crash_kexec) from [<c01e419c>] (crash_kexec+0x5c/0x68)
[<c01e419c>] (crash_kexec) from [<c010c5c0>] (die+0x228/0x490)
[<c010c5c0>] (die) from [<c011e520>] (__do_kernel_fault.part.0+0x54/0x1e4)
[<c011e520>] (__do_kernel_fault.part.0) from [<c082412c>] (do_page_fault+0x1e8/0x400)
[<c082412c>] (do_page_fault) from [<c010135c>] (do_DataAbort+0x38/0xb8)
[<c010135c>] (do_DataAbort) from [<c0823584>] (__dabt_svc+0x64/0xa0)
This is caused by image->control_code_page being a highmem page, so
page_address(image->control_code_page) returns NULL. In any case, we
don't want the control page to be a highmem page.
We already limit the crash kernel region to the top of 32-bit physical
memory space. Also limit it to the top of lowmem in physical space.
Reported-by: Keerthy <j-keerthy@ti.com>
Tested-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Mike Galbraith reported a situation where a WARN_ON_ONCE() call in DRM
code turned into an oops. As it turns out, WARN_ON_ONCE() seems to be
completely broken when called from a module.
The bug was introduced with the following commit:
19d436268d ("debug: Add _ONCE() logic to report_bug()")
That commit changed WARN_ON_ONCE() to move its 'once' logic into the bug
trap handler. It requires a writable bug table so that the BUGFLAG_DONE
bit can be written to the flags to indicate the first warning has
occurred.
The bug table was made writable for vmlinux, which relies on
vmlinux.lds.S and vmlinux.lds.h for laying out the sections. However,
it wasn't made writable for modules, which rely on the ELF section
header flags.
Reported-by: Mike Galbraith <efault@gmx.de>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 19d436268d ("debug: Add _ONCE() logic to report_bug()")
Link: http://lkml.kernel.org/r/a53b04235a65478dd9afc51f5b329fdc65c84364.1500095401.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
In current die(), the irq is disabled for __die() handle, not
including the possible panic() handling. Since the log in __die()
can take several hundreds ms, new irq might come and interrupt
current die().
If the process calling die() holds some critical resource, and some
other process scheduled later also needs it, then it would deadlock.
The first panic will not be executed.
So here disable irq for the whole flow of die().
Signed-off-by: Qiao Zhou <qiaozhou@asrmicro.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The lse implementation of atomic64_dec_if_positive uses the '+&' constraint,
but the '&' is redundant and confusing in this case, since early clobber
on a read/write operand is a strange concept.
Replace the constraint with '+'.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Casting a pointer to an integral type doesn't require a __force
attribute, because you'll need to cast back to a pointer in order to
dereference the thing anyway.
This patch removes the redundant __force cast from __range_ok.
Reported-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We have space for exactly three characters for the index in "max7315_%d_base",
but as GCC points out having more would cause an string overflow:
arch/x86/platform/intel-mid/device_libs/platform_max7315.c: In function 'max7315_platform_data':
arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: error: '%d' directive writing between 1 and 11 bytes into a region of size 9 [-Werror=format-overflow=]
sprintf(base_pin_name, "max7315_%d_base", nr);
^~~~~~~~~~~~~~~~~
arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: note: directive argument in the range [-2147483647, 2147483647]
arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:3: note: 'sprintf' output between 15 and 25 bytes into a destination of size 17
sprintf(base_pin_name, "max7315_%d_base", nr);
This makes it use an snprintf() to truncate the string if that happened
rather than overflowing the stack. In practice, this is safe, because
there won't be a large number of max7315 devices in the systems, and
both the format and the length are defined by the firmware interface.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170719125310.2487451-9-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The IOSF_MBI option requires PCI support, without it we get a harmless
Kconfig warning when it gets selected by PUNIT_ATOM_DEBUG:
warning: (X86_INTEL_LPSS && SND_SST_IPC_ACPI && MMC_SDHCI_ACPI && PUNIT_ATOM_DEBUG) selects IOSF_MBI which has unmet direct dependencies (PCI)
This adds another dependency to avoid the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170719125310.2487451-8-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Every kernel build on x86 will result in some output:
Setup is 13084 bytes (padded to 13312 bytes).
System is 4833 kB
CRC 6d35fa35
Kernel: arch/x86/boot/bzImage is ready (#2)
This shuts it up, so that 'make -s' is truely silent as long as
everything works. Building without '-s' should produce unchanged
output.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170719125310.2487451-6-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The x86 version of insb/insw/insl uses an inline assembly that does
not have the target buffer listed as an output. This can confuse
the compiler, leading it to think that a subsequent access of the
buffer is uninitialized:
drivers/net/wireless/wl3501_cs.c: In function ‘wl3501_mgmt_scan_confirm’:
drivers/net/wireless/wl3501_cs.c:665:9: error: ‘sig.status’ is used uninitialized in this function [-Werror=uninitialized]
drivers/net/wireless/wl3501_cs.c:668:12: error: ‘sig.cap_info’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/net/sb1000.c: In function 'sb1000_rx':
drivers/net/sb1000.c:775:9: error: 'st[0]' is used uninitialized in this function [-Werror=uninitialized]
drivers/net/sb1000.c:776:10: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/net/sb1000.c:784:11: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
I tried to mark the exact input buffer as an output here, but couldn't
figure it out. As suggested by Linus, marking all memory as clobbered
however is good enough too. For the outs operations, I also add the
memory clobber, to force the input to be written to local variables.
This is probably already guaranteed by the "asm volatile", but it can't
hurt to do this for symmetry.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Link: http://lkml.kernel.org/r/20170719125310.2487451-5-arnd@arndb.de
Link: https://lkml.org/lkml/2017/7/12/605
Signed-off-by: Ingo Molnar <mingo@kernel.org>
gcc-7.1.1 produces this warning:
arch/x86/math-emu/reg_add_sub.c: In function 'FPU_add':
arch/x86/math-emu/reg_add_sub.c:80:48: error: ?: using integer constants in boolean context [-Werror=int-in-bool-context]
This appears to be a bug in gcc-7.1.1, and I have reported it as
PR81484. The compiler suggests that code written as
if (a & b ? c : d)
is usually incorrect and should have been
if (a & (b ? c : d))
However, in this case, we correctly write
if ((a & b) ? c : d)
and should not get a warning for it.
This adds a dirty workaround for the problem, adding a comparison with
zero inside of the macro. The warning is currently disabled in the kernel,
so we may decide not to apply the patch, and instead wait for future gcc
releases to fix the problem. On the other hand, it seems to be the
only instance of this particular problem.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Bill Metzenthen <billm@melbpc.org.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170719125310.2487451-4-arnd@arndb.de
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When building the kernel with "make EXTRA_CFLAGS=...", this overrides
the "PARANOID" preprocessor macro defined in arch/x86/math-emu/Makefile,
and we run into a build warning:
arch/x86/math-emu/reg_compare.c: In function ‘compare_i_st_st’:
arch/x86/math-emu/reg_compare.c:254:6: error: ‘f’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
This fixes the implementation to work correctly even without the PARANOID
flag, and also fixes the Makefile to not use the EXTRA_CFLAGS variable
but instead use the ccflags-y variable in the Makefile that is meant
for this purpose.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Bill Metzenthen <billm@melbpc.org.au>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170719125310.2487451-3-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The intialization function checks for various failure scenarios, but
unfortunately the compiler gets a little confused about the possible
combinations, leading to a false-positive build warning when
-Wmaybe-uninitialized is set:
arch/x86/events/core.c: In function ‘init_hw_perf_events’:
arch/x86/events/core.c:264:3: warning: ‘reg_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized]
arch/x86/events/core.c:264:3: warning: ‘val_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized]
pr_err(FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n",
We can't actually run into this case, so this shuts up the warning
by initializing the variables to a known-invalid state.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170719125310.2487451-2-arnd@arndb.de
Link: https://patchwork.kernel.org/patch/9392595/
Signed-off-by: Ingo Molnar <mingo@kernel.org>
One of the rarely executed code pathes in check_timer() calls
unmask_ioapic_irq() passing irq_get_chip_data(0) as argument.
That's wrong as unmask_ioapic_irq() expects a pointer to the irq data of
interrupt 0. irq_get_chip_data(0) returns NULL, so the following
dereference in unmask_ioapic_irq() causes a kernel panic.
The issue went unnoticed in the first place because irq_get_chip_data()
returns a void pointer so the compiler cannot do a type check on the
argument. The code path was added for machines with broken configuration,
but it seems that those machines are either not running current kernels or
simply do not longer exist.
Hand in irq_get_irq_data(0) as argument which provides the correct data.
[ tglx: Rewrote changelog ]
Fixes: 4467715a44 ("x86/irq: Move irq_cfg.irq_2_pin into io_apic.c")
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1500369644-45767-1-git-send-email-kkamagui@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The bus_irq argument of mp_override_legacy_irq() is used as the index into
the isa_irq_to_gsi[] array. The bus_irq argument originates from
ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI
tables, but is nowhere sanity checked.
That allows broken or malicious ACPI tables to overwrite memory, which
might cause malfunction, panic or arbitrary code execution.
Add a sanity check and emit a warning when that triggers.
[ tglx: Added warning and rewrote changelog ]
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: security@kernel.org
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
randstruct plugin, including the task_struct.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Kees Cook <kees@outflux.net>
iQIcBAABCgAGBQJZbRgGAAoJEIly9N/cbcAmk2AQAIL60aQ+9RIcFAXriFhnd7Z2
x9Jqi9JNc8NgPFXx8GhE4J4eTZ5PwcjgXBpNRWY/laBkRyoBHn24ku09YxrJjmHz
ZSUsP+/iO9lVeEfbmU9Tnk50afkfwx6bHXBwkiVGQWHtybNVUqA19JbqkHeg8ubx
myKLGeUv5PPCodRIcBDD0+HaAANcsqtgbDpgmWU8s+IXWwvWCE2p7PuBw7v3HHgH
qzlPDHYQCRDw+LWsSqPaHj+9mbRO18P/ydMoZHGH4Hl3YYNtty8ZbxnraI3A7zBL
6mLUVcZ+/l88DqHc5I05T8MmLU1yl2VRxi8/jpMAkg9wkvZ5iNAtlEKIWU6eqsvk
vaImNOkViLKlWKF+oUD1YdG16d8Segrc6m4MGdI021tb+LoGuUbkY7Tl4ee+3dl/
9FM+jPv95HjJnyfRNGidh2TKTa9KJkh6DYM9aUnktMFy3ca1h/LuszOiN0LTDiHt
k5xoFURk98XslJJyXM8FPwXCXiRivrXMZbg5ixNoS4aYSBLv7Cn1M6cPnSOs7UPh
FqdNPXLRZ+vabSxvEg5+41Ioe0SHqACQIfaSsV5BfF2rrRRdaAxK4h7DBcI6owV2
7ziBN1nBBq2onYGbARN6ApyCqLcchsKtQfiZ0iFsvW7ZawnkVOOObDTCgPl3tdkr
403YXzphQVzJtpT5eRV6
=ngAW
-----END PGP SIGNATURE-----
Merge tag 'gcc-plugins-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull structure randomization updates from Kees Cook:
"Now that IPC and other changes have landed, enable manual markings for
randstruct plugin, including the task_struct.
This is the rest of what was staged in -next for the gcc-plugins, and
comes in three patches, largest first:
- mark "easy" structs with __randomize_layout
- mark task_struct with an optional anonymous struct to isolate the
__randomize_layout section
- mark structs to opt _out_ of automated marking (which will come
later)
And, FWIW, this continues to pass allmodconfig (normal and patched to
enable gcc-plugins) builds of x86_64, i386, arm64, arm, powerpc, and
s390 for me"
* tag 'gcc-plugins-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
randstruct: opt-out externally exposed function pointer structs
task_struct: Allow randomized layout
randstruct: Mark various structs for randomization
KVM tries to select 'TASKSTATS', which had additional dependencies:
warning: (KVM) selects TASKSTATS which has unmet direct dependencies (NET && MULTIUSER)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Immediately following MOV-to-SS/POP-to-SS, VM-entry is
disallowed. This check comes after the check for a valid VMCS. When
this check fails, the instruction pointer should fall through to the
next instruction, the ALU flags should be set to indicate VMfailValid,
and the VM-instruction error should be set to 26 ("VM entry with
events blocked by MOV SS").
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
vmx_recover_nmi_blocking is using a cached value of the guest
interruptibility info, which is stored in vmx->nmi_known_unmasked.
vmx_recover_nmi_blocking is run for both normal and nested guests,
so the cached value must be per-VMCS.
This fixes eventinj.flat in a nested non-EPT environment. With EPT it
works, because the EPT violation handler doesn't have the
vmx->nmi_known_unmasked optimization (it is unnecessary because, unlike
vmx_recover_nmi_blocking, it can just look at the exit qualification).
Thanks to Wanpeng Li for debugging the testcase and providing an initial
patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The crypto engines found on the cp110 master and slave are dma coherent.
This patch adds the relevant property to their dt nodes.
Cc: stable@vger.kernel.org # v4.12+
Fixes: 973020fd94 ("arm64: marvell: dts: add crypto engine description for 7k/8k")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
kvm_read_cr3() returns an unsigned long and gfn is a u64. We intended
to mask out the bottom 5 bits but because of the type issue we mask the
top 32 bits as well. I don't know if this is a real problem, but it
causes static checker warnings.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Fix a build error caused by not including <linux/bug.h>.
The following compilation errors are caused by the missing header:
arch/mips/ralink/mt7620.c: In function ‘mt7620_get_cpu_pll_rate’:
arch/mips/ralink/mt7620.c:431:2: error: implicit declaration of function ‘WARN_ON’ [-Werror=implicit-function-declaration]
WARN_ON(div >= ARRAY_SIZE(mt7620_clk_divider));
^
arch/mips/ralink/mt7620.c: In function ‘mt7620_get_sys_rate’:
arch/mips/ralink/mt7620.c:500:2: error: implicit declaration of function ‘WARN’ [-Werror=implicit-function-declaration]
if (WARN(!div, "invalid divider for OCP ratio %u", ocp_ratio))
^
arch/mips/ralink/mt7620.c: In function ‘mt7620_dram_init’:
arch/mips/ralink/mt7620.c:619:3: error: implicit declaration of function ‘BUG’ [-Werror=implicit-function-declaration]
BUG();
^
cc1: some warnings being treated as errors
scripts/Makefile.build:302: recipe for target 'arch/mips/ralink/mt7620.o' failed
Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16781/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Previously, <linux/module.h> was included before ralink_regs.h in all
ralink files - leading to <linux/io.h> being implicitly included.
After commit 26dd3e4ff9 ("MIPS: Audit and remove any unnecessary
uses of module.h") removed the inclusion of module.h from multiple
places, some ralink platforms failed to build with the following error:
In file included from arch/mips/ralink/mt7620.c:17:0:
./arch/mips/include/asm/mach-ralink/ralink_regs.h: In function ‘rt_sysc_w32’:
./arch/mips/include/asm/mach-ralink/ralink_regs.h:38:2: error: implicit declaration of function ‘__raw_writel’ [-Werror=implicit-function-declaration]
__raw_writel(val, rt_sysc_membase + reg);
^
./arch/mips/include/asm/mach-ralink/ralink_regs.h: In function ‘rt_sysc_r32’:
./arch/mips/include/asm/mach-ralink/ralink_regs.h:43:2: error: implicit declaration of function ‘__raw_readl’ [-Werror=implicit-function-declaration]
return __raw_readl(rt_sysc_membase + reg);
Fix this by including <linux/io.h>.
Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com>
Fixes: 26dd3e4ff9 ("MIPS: Audit and remove any unnecessary uses of module.h")
Cc: John Crispin <john@phrozen.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org> #4.11+
Patchwork: https://patchwork.linux-mips.org/patch/16780/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This fixes another cause of random segfaults and bus errors that may
occur while running perf with the callgraph option.
Critical sections beginning with spin_lock_irqsave() raise the interrupt
level to PIL_NORMAL_MAX (14) and intentionally do not block performance
counter interrupts, which arrive at PIL_NMI (15).
But some sections of code are "super critical" with respect to perf
because the perf_callchain_user() path accesses user space and may cause
TLB activity as well as faults as it unwinds the user stack.
One particular critical section occurs in switch_mm:
spin_lock_irqsave(&mm->context.lock, flags);
...
load_secondary_context(mm);
tsb_context_switch(mm);
...
spin_unlock_irqrestore(&mm->context.lock, flags);
If a perf interrupt arrives in between load_secondary_context() and
tsb_context_switch(), then perf_callchain_user() could execute with
the context ID of one process, but with an active TSB for a different
process. When the user stack is accessed, it is very likely to
incur a TLB miss, since the h/w context ID has been changed. The TLB
will then be reloaded with a translation from the TSB for one process,
but using a context ID for another process. This exposes memory from
one process to another, and since it is a mapping for stack memory,
this usually causes the new process to crash quickly.
This super critical section needs more protection than is provided
by spin_lock_irqsave() since perf interrupts must not be allowed in.
Since __tsb_context_switch already goes through the trouble of
disabling interrupts completely, we fix this by moving the secondary
context load down into this better protected region.
Orabug: 25577560
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the conversion of the Marvell CP110 Device Tree description from
using GIC interrupts to using ICU interrupts was done, the RTC on the
slave CP110 was left unchanged. This commit fixes that, so that all
devices on the CP properly get their interrupt through the ICU.
Fixes: 6ef84a827c ("arm64: dts: marvell: enable GICP and ICU on Armada 7K/8K")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
There's a bug in PEBs event enabling code, that prevents PEBS
freq events to work properly after non freq PEBS event was run.
freq events - perf_event_attr::freq set
-F <freq> option of perf record
PEBS events - perf_event_attr::precise_ip > 0
default for perf record
Like in following example with CPU 0 busy, we expect ~10000 samples
for following perf tool run:
# perf record -F 10000 -C 0 sleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.640 MB perf.data (10031 samples) ]
Everything's fine, but once we run non freq PEBS event like:
# perf record -c 10000 -C 0 sleep 1
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 1.053 MB perf.data (20061 samples) ]
the freq events start to fail like this:
# perf record -F 10000 -C 0 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.185 MB perf.data (40 samples) ]
The issue is in non freq PEBs event initialization of debug_store reset
field, which value is used to auto-reload the counter value after PEBS
event drain. This value is not being used for PEBS freq events, but once
we run non freq event it stays in debug_store data and screws the
sample_freq counting for PEBS freq events.
Setting the reset field to 0 for freq events.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170714163551.19459-1-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add perf core PMU support for Intel Goldmont Plus CPU cores:
- The init code is based on Goldmont.
- There is a new cache event list, based on the Goldmont cache event
list.
- All four general-purpose performance counters support PEBS.
- The first general-purpose performance counter is for reduced skid
PEBS mechanism. Using :ppp to indicate the event which want to do
reduced skid PEBS.
- Goldmont Plus has 4-wide pipeline for Topdown
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Link: http://lkml.kernel.org/r/20170712134423.17766-1-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Goldmont microarchitecture supports C1/C3/C6, PC2/PC3/PC6/PC10 state
residency counters, the patch enables them for Apollo Lake platform.
The MSR information is based on Intel Software Developers' Manual,
Vol. 4, Order No. 335592, Table 2-6 and 2-12.
Signed-off-by: Harry Pan <harry.pan@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: bp@suse.de
Cc: davidcc@google.com
Cc: gs0622@gmail.com
Cc: lukasz.odzioba@intel.com
Cc: piotr.luc@intel.com
Cc: srinivas.pandruvada@linux.intel.com
Link: http://lkml.kernel.org/r/20170717103749.24337-1-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently even with STRICT_KERNEL_RWX we leave the __init text marked
executable after init, which is bad.
Add a hook to mark it NX (no-execute) before we free it, and implement
it for radix and hash.
Note that we use __init_end as the end address, not _einittext,
because overlaps_kernel_text() uses __init_end, because there are
additional executable sections other than .init.text between
__init_begin and __init_end.
Tested on radix and hash with:
0:mon> p $__init_begin
*** 400 exception occurred
Fixes: 1e0fc9d1eb ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
As written in the datasheet the PCA955 can only handle low level irq and
not edge irq.
Without this fix the interrupt is not usable for pca955: the gpio-pca953x
driver already set the irq type as low level which is incompatible with
edge type, then the kernel prevents using the interrupt:
"irq: type mismatch, failed to map hwirq-18 for
/soc/internal-regs/gpio@18100!"
Fixes: 928413bd85 ("ARM: mvebu: Add Armada 388 General Purpose
Development Board support")
Cc: stable@vger.kernel.org
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
As we already did for Armada XP switch from virt_to_phys() to
__pa_symbol().
The reason for it was well explained by Mark Rutland so let's quote him:
"virt_to_phys() is intended to operate on the linear/direct mapping of
RAM.
__pa_symbol() is intended to operate on the kernel mapping, which may
not be in the linear/direct mapping on all architectures. e.g. arm64 and
x86_64 map the kernel image and RAM separately.
On 32-bit ARM the kernel image mapping is tied to the linear/direct
mapping, so that works, but as it's semantically wrong (and broken for
generic code), the DEBUG_VIRTUAL checks complain."
Fixes: db88977894 ("arm: mvebu: support for SMP on 98DX3336 SoC")
Cc: <stable@vger.kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Move the core logic into a helper, so we can use it for changing other
permissions.
We also change the logic to align start down, and end up. This means
calling the function with a range will expand that range to be at
least 1 mmu_linear_psize page in size. We need that so we can use it
on __init_begin ... __init_end which is not a full page in size.
This should always work for _stext/__init_begin, because we align
__init_begin to _stext + 16M in the linker script.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Move the core logic into a helper, so we can use it for changing permissions
other than _PAGE_WRITE.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A recent commit:
d6e41f1151 ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant")
introduced a VM_WARN_ON(!in_atomic()) which generates false positives
on every VM entry on !CONFIG_PREEMPT_COUNT kernels.
Replace it with a test for preemptible(), which appears to match the
original intent and works across different CONFIG_PREEMPT* variations.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: linux-mm@kvack.org
Fixes: d6e41f1151 ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A previous optimisation incorrectly assumed the PAPR hcall does
not use r12, and clobbers it upon entry. In fact it is used as
an input. This can result in KVM guests crashing (observed with
PR KVM).
Instead of using r12 to save r13, tihs patch saves r13 in ctr.
This is more costly, but not as slow as using the SPRG.
Fixes: acd7d8cef0 ("powerpc/64s: Optimize hypercall/syscall entry")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Remove old, dead Kconfig option INET_LRO. It is gone since
commit 7bbf3cae65 ("ipv4: Remove inet_lro library").
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
POWER9 DD2 can see spurious PMU interrupts after state-loss idle in
some conditions.
A solution is to save and reload MMCR0 over state-loss idle.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull sparc fixes from David Miller:
- Fix DMA regression in 4.13 merge window, only certain chips can do
64-bit DMA. From Dave Dushar.
- Correct cpu cross-call algorithm to correctly detect stalled or stuck
remote cpus, from Jane Chu.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Measure receiver forward progress to avoid send mondo timeout
SPARC64: Fix sun4v DMA panic
Several variables had their types changed from unsigned long to u32,
but the printk()-style format to print them wasn't updated, leading to:
arch/blackfin/kernel/flat.c: In function 'bfin_get_addr_from_rp':
arch/blackfin/kernel/flat.c:35:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'u32' [-Wformat]
arch/blackfin/kernel/flat.c: In function 'bfin_put_addr_at_rp':
arch/blackfin/kernel/flat.c:80:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'u32' [-Wformat]
Fixes: 468138d785 ("binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag
because it doesn't copy anything from/to userspace to access the
argument.
Fixes: 54ebbfb160 ("tty: add TIOCGPTPEER ioctl")
Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: Aleksa Sarai <asarai@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Most platforms using OMAP hsmmc driver have switched to device tree
for passing platform data to omap_hsmmc.c driver.
The hsmmc.c file in mach-omap2 exists only to support pandora board
which uses wl1251 driver in legacy platform data mode.
Hence, remove the dead code not used by the pandora board.
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In commit 1c0eaf0f56 ("powerpc/powernv: Tell OPAL about our MMU mode
on POWER9"), we added additional flags to the OPAL call to configure
CPUs at boot.
These flags only work on Power9 firmwares, and worse can cause boot
failures on Power8 machines, so we check for CPU_FTR_ARCH_300 (aka POWER9)
before adding the extra flags.
Unfortunately we forgot that opal_configure_cores() is called before
the CPU feature checks are dynamically patched, meaning the check
always returns true.
We definitely need to do something to make the CPU feature checks less
prone to bugs like this, but for now the minimal fix is to use
early_cpu_has_feature().
Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Fixes: 1c0eaf0f56 ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
A new compatible string has been introduced for sama5d2 SMC to allow to
manage the registers mapping change.
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
In file included from include/linux/flat.h:13:0,
from fs/binfmt_flat.c:36:
arch/h8300/include/asm/flat.h: In function 'flat_get_addr_from_rp':
arch/h8300/include/asm/flat.h:28:3: error: expected ')' before 'val'
val &= 0x00ffffff;
^
arch/h8300/include/asm/flat.h:31:1: error: expected expression before '}' token
}
^
In file included from include/linux/flat.h:13:0,
from fs/binfmt_flat.c:36:
arch/h8300/include/asm/flat.h:26:6: warning: unused variable 'val' [-Wunused-variable]
u32 val = get_unaligned((__force u32 *)rp);
^
In file included from include/linux/flat.h:13:0,
from fs/binfmt_flat.c:36:
arch/h8300/include/asm/flat.h:31:1: warning: no return statement in function returning non-void [-Wreturn-type]
}
^
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 468138d785 ("binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several variables had their types changed from unsigned long to u32, but
the arch-specific implementations of flat_set_persistent() weren't
updated, leading to compiler warnings on blackfin and m68k:
fs/binfmt_flat.c: In function ‘load_flat_file’:
fs/binfmt_flat.c:799: warning: passing argument 2 of ‘flat_set_persistent’ from incompatible pointer type
Fixes: 468138d785 ("binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The binding specifies the actual implementations only (mali-t760
for example) but not the arm,mali-midgard used in some vendor kernels.
So drop that compatible property from the rk3288 where it had slipped in.
Also fix the node name which should be a generic gpu@...
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Bug fix for the BAU tunable congested_cycles not being set to the user
defined value. Instead of referencing a global variable when deciding
on BAU shutdown, a node will reference its own tunable set
value ( cong_response_us). This results in the user set
tunable value congested_response_us taking effect correctly.
Signed-off-by: Justin Ernst <justin.ernst@hpe.com>
Acked-by: Andrew Banman <abanman@hpe.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: mike.travis@hpe.com
Cc: sivanich@hpe.com
Link: http://lkml.kernel.org/r/1499970803-282432-1-git-send-email-justin.ernst@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This old piece of code is supposed to measure the performance of indirect
calls to determine if the processor is buggy or not, however the compiler
optimizer turns it into a direct call.
Use the OPTIMIZER_HIDE_VAR() macro to thwart the optimization, so that a real
indirect call is generated.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707110737530.8746@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
pinctrl_spi4 pin group is not part of the low power iomux controller,
so move it under the normal iomuxc node.
Fixes: 184f39b57c ("ARM: dts: imx7d-sdb: Add GPIO expander node")
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Pull ->s_options removal from Al Viro:
"Preparations for fsmount/fsopen stuff (coming next cycle). Everything
gets moved to explicit ->show_options(), killing ->s_options off +
some cosmetic bits around fs/namespace.c and friends. Basically, the
stuff needed to work with fsmount series with minimum of conflicts
with other work.
It's not strictly required for this merge window, but it would reduce
the PITA during the coming cycle, so it would be nice to have those
bits and pieces out of the way"
* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
isofs: Fix isofs_show_options()
VFS: Kill off s_options and helpers
orangefs: Implement show_options
9p: Implement show_options
isofs: Implement show_options
afs: Implement show_options
affs: Implement show_options
befs: Implement show_options
spufs: Implement show_options
bpf: Implement show_options
ramfs: Implement show_options
pstore: Implement show_options
omfs: Implement show_options
hugetlbfs: Implement show_options
VFS: Don't use save/replace_mount_options if not using generic_show_options
VFS: Provide empty name qstr
VFS: Make get_filesystem() return the affected filesystem
VFS: Clean up whitespace in fs/namespace.c and fs/super.c
Provide a function to create a NUL-terminated string from unterminated data
Pull uacess-unaligned removal from Al Viro:
"That stuff had just one user, and an exotic one, at that - binfmt_flat
on arm and m68k"
* 'work.uaccess-unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
kill {__,}{get,put}_user_unaligned()
binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail
Pull MIPS updates from Ralf Baechle:
"Boston platform support:
- Document DT bindings
- Add CLK driver for board clocks
CM:
- Avoid per-core locking with CM3 & higher
- WARN on attempt to lock invalid VP, not BUG
CPS:
- Select CONFIG_SYS_SUPPORTS_SCHED_SMT for MIPSr6
- Prevent multi-core with dcache aliasing
- Handle cores not powering down more gracefully
- Handle spurious VP starts more gracefully
DSP:
- Add lwx & lhx missaligned access support
eBPF:
- Add MIPS support along with many supporting change to add the
required infrastructure
Generic arch code:
- Misc sysmips MIPS_ATOMIC_SET fixes
- Drop duplicate HAVE_SYSCALL_TRACEPOINTS
- Negate error syscall return in trace
- Correct forced syscall errors
- Traced negative syscalls should return -ENOSYS
- Allow samples/bpf/tracex5 to access syscall arguments for sane
traces
- Cleanup from old Kconfig options in defconfigs
- Fix PREF instruction usage by memcpy for MIPS R6
- Fix various special cases in the FPU eulation
- Fix some special cases in MIPS16e2 support
- Fix MIPS I ISA /proc/cpuinfo reporting
- Sort MIPS Kconfig alphabetically
- Fix minimum alignment requirement of IRQ stack as required by
ABI / GCC
- Fix special cases in the module loader
- Perform post-DMA cache flushes on systems with MAARs
- Probe the I6500 CPU
- Cleanup cmpxchg and add support for 1 and 2 byte operations
- Use queued read/write locks (qrwlock)
- Use queued spinlocks (qspinlock)
- Add CPU shared FTLB feature detection
- Handle tlbex-tlbp race condition
- Allow storing pgd in C0_CONTEXT for MIPSr6
- Use current_cpu_type() in m4kc_tlbp_war()
- Support Boston in the generic kernel
Generic platform:
- yamon-dt: Pull YAMON DT shim code out of SEAD-3 board
- yamon-dt: Support > 256MB of RAM
- yamon-dt: Use serial* rather than uart* aliases
- Abstract FDT fixup application
- Set RTC_ALWAYS_BCD to 0
- Add a MAINTAINERS entry
core kernel:
- qspinlock.c: include linux/prefetch.h
Loongson 3:
- Add support
Perf:
- Add I6500 support
SEAD-3:
- Remove GIC timer from DT
- Set interrupt-parent per-device, not at root node
- Fix GIC interrupt specifiers
SMP:
- Skip IPI setup if we only have a single CPU
VDSO:
- Make comment match reality
- Improvements to time code in VDSO"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (86 commits)
locking/qspinlock: Include linux/prefetch.h
MIPS: Fix MIPS I ISA /proc/cpuinfo reporting
MIPS: Fix minimum alignment requirement of IRQ stack
MIPS: generic: Support MIPS Boston development boards
MIPS: DTS: img: Don't attempt to build-in all .dtb files
clk: boston: Add a driver for MIPS Boston board clocks
dt-bindings: Document img,boston-clock binding
MIPS: Traced negative syscalls should return -ENOSYS
MIPS: Correct forced syscall errors
MIPS: Negate error syscall return in trace
MIPS: Drop duplicate HAVE_SYSCALL_TRACEPOINTS select
MIPS16e2: Provide feature overrides for non-MIPS16 systems
MIPS: MIPS16e2: Report ASE presence in /proc/cpuinfo
MIPS: MIPS16e2: Subdecode extended LWSP/SWSP instructions
MIPS: MIPS16e2: Identify ASE presence
MIPS: VDSO: Fix a mismatch between comment and preprocessor constant
MIPS: VDSO: Add implementation of gettimeofday() fallback
MIPS: VDSO: Add implementation of clock_gettime() fallback
MIPS: VDSO: Fix conversions in do_monotonic()/do_monotonic_coarse()
MIPS: Use current_cpu_type() in m4kc_tlbp_war()
...
Pull UML updates from Richard Weinberger:
"Mostly fixes for UML:
- First round of fixes for PTRACE_GETRESET/SETREGSET
- A printf vs printk cleanup
- Minor improvements"
* 'for-linus-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Correctly check for PTRACE_GETRESET/SETREGSET
um: v2: Use generic NOTES macro
um: Add kerneldoc for userspace_tramp() and start_userspace()
um: Add kerneldoc for segv_handler
um: stub-data.h: remove superfluous include
um: userspace - be more verbose in ptrace set regs error
um: add dummy ioremap and iounmap functions
um: Allow building and running on older hosts
um: Avoid longjmp/setjmp symbol clashes with libpthread.a
um: console: Ignore console= option
um: Use os_warn to print out pre-boot warning/error messages
um: Add os_warn() for pre-boot warning/error messages
um: Use os_info for the messages on normal path
um: Add os_info() for pre-boot information messages
um: Use printk instead of printf in make_uml_dir
Common:
- add uevents for VM creation/destruction
- annotate and properly access RCU-protected objects
s390:
- rename IOCTL added in the first v4.13 merge
x86:
- emulate VMLOAD VMSAVE feature in SVM
- support paravirtual asynchronous page fault while nested
- add Hyper-V userspace interfaces for better migration
- improve master clock corner cases
- extend internal error reporting after EPT misconfig
- correct single-stepping of emulated instructions in SVM
- handle MCE during VM entry
- fix nVMX VM entry checks and nVMX VMCS shadowing
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJZaOm6AAoJEED/6hsPKofoqO8H/3breVIyVv9mwg7A5+o+6LTq
GzV/YXHSC8NtfxZn8ViS/TCziYiBSFv7XiPSodkXbOgYSz8Yya5x9D+dbEH+xgG7
l+LsZEqdSFbHCkvKrMiwSsoXtsT5WygA56+KZiBmu8cvlwqSyXWHFn3ZJ1wKzGq/
zivlkfCoh2m6bGdNmrG9pHUSgxvDh94pXesaVBKy4hgeovY1qjzby3Lo+HuIUzai
exuEU1EKRlUIfLK1B2Anp5IIv5Q1lFnMSvD6YSiWYywZb95dN/adsX1bv+MKeOdt
TIAgotsWjaAuT9JolAJjfVPHG0+uMBMsWg4Zh9Ra/gPPaSh3KEC2h1++zEYKjvw=
=1zII
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Radim Krčmář:
"Second batch of KVM updates for v4.13
Common:
- add uevents for VM creation/destruction
- annotate and properly access RCU-protected objects
s390:
- rename IOCTL added in the first v4.13 merge
x86:
- emulate VMLOAD VMSAVE feature in SVM
- support paravirtual asynchronous page fault while nested
- add Hyper-V userspace interfaces for better migration
- improve master clock corner cases
- extend internal error reporting after EPT misconfig
- correct single-stepping of emulated instructions in SVM
- handle MCE during VM entry
- fix nVMX VM entry checks and nVMX VMCS shadowing"
* tag 'kvm-4.13-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
kvm: x86: hyperv: make VP_INDEX managed by userspace
KVM: async_pf: Let guest support delivery of async_pf from guest mode
KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf
KVM: async_pf: Add L1 guest async_pf #PF vmexit handler
KVM: x86: Simplify kvm_x86_ops->queue_exception parameter list
kvm: x86: hyperv: add KVM_CAP_HYPERV_SYNIC2
KVM: x86: make backwards_tsc_observed a per-VM variable
KVM: trigger uevents when creating or destroying a VM
KVM: SVM: Enable Virtual VMLOAD VMSAVE feature
KVM: SVM: Add Virtual VMLOAD VMSAVE feature definition
KVM: SVM: Rename lbr_ctl field in the vmcb control area
KVM: SVM: Prepare for new bit definition in lbr_ctl
KVM: SVM: handle singlestep exception when skipping emulated instructions
KVM: x86: take slots_lock in kvm_free_pit
KVM: s390: Fix KVM_S390_GET_CMMA_BITS ioctl definition
kvm: vmx: Properly handle machine check during VM-entry
KVM: x86: update master clock before computing kvmclock_offset
kvm: nVMX: Shadow "high" parts of shadowed 64-bit VMCS fields
kvm: nVMX: Fix nested_vmx_check_msr_bitmap_controls
kvm: nVMX: Validate the I/O bitmaps on nested VM-entry
...
Pull crypto fixes from Herbert Xu:
- fix new compiler warnings in cavium
- set post-op IV properly in caam (this fixes chaining)
- fix potential use-after-free in atmel in case of EBUSY
- fix sleeping in softirq path in chcr
- disable buggy sha1-avx2 driver (may overread and page fault)
- fix use-after-free on signals in caam
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: cavium - make several functions static
crypto: chcr - Avoid algo allocation in softirq.
crypto: caam - properly set IV after {en,de}crypt
crypto: atmel - only treat EBUSY as transient if backlog
crypto: af_alg - Avoid sock_graft call warning
crypto: caam - fix signals handling
crypto: sha1-ssse3 - Disable avx2
Merge even more updates from Andrew Morton:
- a few leftovers
- fault-injector rework
- add a module loader test driver
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kmod: throttle kmod thread limit
kmod: add test driver to stress test the module loader
MAINTAINERS: give kmod some maintainer love
xtensa: use generic fb.h
fault-inject: add /proc/<pid>/fail-nth
fault-inject: simplify access check for fail-nth
fault-inject: make fail-nth read/write interface symmetric
fault-inject: parse as natural 1-based value for fail-nth write interface
fault-inject: automatically detect the number base for fail-nth write interface
kernel/watchdog.c: use better pr_fmt prefix
MAINTAINERS: move the befs tree to kernel.org
lib/atomic64_test.c: add a test that atomic64_inc_not_zero() returns an int
mm: fix overflow check in expand_upwards()
Pull arch/tile updates from Chris Metcalf:
"This adds support for an <arch/intreg.h> to help with removing
__need_xxx #defines from glibc, and removes some dead code in
arch/tile/mm/init.c"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
mm, tile: drop arch_{add,remove}_memory
tile: prefer <arch/intreg.h> to __need_int_reg_t
Nothing that really stands out, just a bunch of fixes that have come in in the
last couple of weeks.
None of these are actually fixes for code that is new in 4.13. It's roughly half
older bugs, with fixes going to stable, and half fixes/updates for Power9.
Thanks to:
Aneesh Kumar K.V, Anton Blanchard, Balbir Singh, Benjamin Herrenschmidt,
Madhavan Srinivasan, Michael Neuling, Nicholas Piggin, Oliver O'Halloran.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZaJKuAAoJEFHr6jzI4aWAojsQAI/Mnu3T9XCPbcuWJVHiNZu2
CubNKFb9H+HdjRZlgdT7dpx0XITCNWziQmBpQZO+w1kwa3s5uv5qNwj8I30HyTFH
qAArMq2Yb7GIW4PR8IpbSdCEWZii4YcIiLYk3kiEMyC2UIWeznBq+j79jFyITd9T
beevhLLq7dMoLO8T2VRSdp4YhdLKNYmgjQG1YFh3YIh8s/tyPQ0OsDaxNk5z8wgt
/htWiVQ2X/xWRHBnVKT4olc97pXqTvtaCdBWjD/fKNZK50YIwAjr2cJVoXpJidAW
POwwMto7rjSJDYpE6IgqQnhz0VqRTNB/4QkyaeHJkt6BPhAaafkrHZYtDgc2GaUJ
G96J6xT9bwHn8Jz1tpsxSgSA9A2GfBhdOUKMpIUVn1l9Q1ELPA0dJzzagLhgRDRG
w97UQ+CJqvRNfJkRHcuJbNqP5i++6yOHcHGB8QLkmkSYVcCm9Hj/fMXj/DJ5tHt/
c9t6uw25eqI7raFcxfh09+QTX8VDV96fa+GTo1HnCH71nGG2GqCoJFXyb3HEW64s
gD7uOPOlFlqfpuGDmV1kmdwH0R15EIJb2b6QsDssrcHEg5fA6z/xcAcV839c9y47
U8tcBiqFF0vWgB1QljTDrKuNsjg8dIeZfkSekBGD0WuBVLSADBl7f3b2k3BZB5H3
WxosyR4ufSVfw53HaDsb
=6eIT
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Nothing that really stands out, just a bunch of fixes that have come
in in the last couple of weeks.
None of these are actually fixes for code that is new in 4.13. It's
roughly half older bugs, with fixes going to stable, and half
fixes/updates for Power9.
Thanks to: Aneesh Kumar K.V, Anton Blanchard, Balbir Singh, Benjamin
Herrenschmidt, Madhavan Srinivasan, Michael Neuling, Nicholas Piggin,
Oliver O'Halloran"
* tag 'powerpc-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64: Fix atomic64_inc_not_zero() to return an int
powerpc: Fix emulation of mfocrf in emulate_step()
powerpc: Fix emulation of mcrf in emulate_step()
powerpc/perf: Add POWER9 alternate PM_RUN_CYC and PM_RUN_INST_CMPL events
powerpc/perf: Fix SDAR_MODE value for continous sampling on Power9
powerpc/asm: Mark cr0 as clobbered in mftb()
powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9
powerpc/mm/radix: Synchronize updates to the process table
powerpc/mm/radix: Properly clear process table entry
powerpc/powernv: Tell OPAL about our MMU mode on POWER9
powerpc/kexec: Fix radix to hash kexec due to IAMR/AMOR
The arch uses a verbatim copy of the asm-generic version and does not
add any own implementations to the header, so use asm-generic/fb.h
instead of duplicating code.
Link: http://lkml.kernel.org/r/20170517083545.2115-1-tklauser@distanz.ch
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A large sun4v SPARC system may have moments of intensive xcall activities,
usually caused by unmapping many pages on many CPUs concurrently. This can
flood receivers with CPU mondo interrupts for an extended period, causing
some unlucky senders to hit send-mondo timeout. This problem gets worse
as cpu count increases because sometimes mappings must be invalidated on
all CPUs, and sometimes all CPUs may gang up on a single CPU.
But a busy system is not a broken system. In the above scenario, as long
as the receiver is making forward progress processing mondo interrupts,
the sender should continue to retry.
This patch implements the receiver's forward progress meter by introducing
a per cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range
of 0..NR_CPUS. The receiver increments its counter as soon as it receives
a mondo and the sender tracks the receiver's counter. If the receiver has
stopped making forward progress when the retry limit is reached, the sender
declares send-mondo-timeout and panic; otherwise, the receiver is allowed
to keep making forward progress.
In addition, it's been observed that PCIe hotplug events generate Correctable
Errors that are handled by hypervisor and then OS. Hypervisor 'borrows'
a guest cpu strand briefly to provide the service. If the cpu strand is
simultaneously the only cpu targeted by a mondo, it may not be available
for the mondo in 20msec, causing SUN4V mondo timeout. It appears that 1 second
is the agreed wait time between hypervisor and guest OS, this patch makes
the adjustment.
Orabug: 25476541
Orabug: 26417466
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hyper-V identifies vCPUs by Virtual Processor Index, which can be
queried via HV_X64_MSR_VP_INDEX msr. It is defined by the spec as a
sequential number which can't exceed the maximum number of vCPUs per VM.
APIC ids can be sparse and thus aren't a valid replacement for VP
indices.
Current KVM uses its internal vcpu index as VP_INDEX. However, to make
it predictable and persistent across VM migrations, the userspace has to
control the value of VP_INDEX.
This patch achieves that, by storing vp_index explicitly on vcpu, and
allowing HV_X64_MSR_VP_INDEX to be set from the host side. For
compatibility it's initialized to KVM vcpu index. Also a few variables
are renamed to make clear distinction betweed this Hyper-V vp_index and
KVM vcpu_id (== APIC id). Besides, a new capability,
KVM_CAP_HYPERV_VP_INDEX, is added to allow the userspace to skip
attempting msr writes where unsupported, to avoid spamming error logs.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Adds another flag bit (bit 2) to MSR_KVM_ASYNC_PF_EN. If bit 2 is 1,
async page faults are delivered to L1 as #PF vmexits; if bit 2 is 0,
kvm_can_do_async_pf returns 0 if in guest mode.
This is similar to what svm.c wanted to do all along, but it is only
enabled for Linux as L1 hypervisor. Foreign hypervisors must never
receive async page faults as vmexits, because they'd probably be very
confused about that.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Add an nested_apf field to vcpu->arch.exception to identify an async page
fault, and constructs the expected vm-exit information fields. Force a
nested VM exit from nested_vmx_check_exception() if the injected #PF is
async page fault.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This patch adds the L1 guest async page fault #PF vmexit handler, such
by L1 similar to ordinary async page fault.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Passed insn parameters to kvm_mmu_page_fault().]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This patch removes all arguments except the first in
kvm_x86_ops->queue_exception since they can extract the arguments from
vcpu->arch.exception themselves.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>