linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-20 10:44:23 +08:00

Author	SHA1	Message	Date
Linus Torvalds	933425fb00	s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWO2IQAAoJEL/70l94x66D/K0H/3AovAgYmJQToZlimsktMk6a f2xhdIqfU5lIQQh5uNBCfL3o9o8H9Py1ym7aEw3fmztPHHJYc91oTatt2UEKhmEw VtZHp/dFHt3hwaIdXmjRPEXiYctraKCyrhaUYdWmUYkoKi7lW5OL5h+S7frG2U6u p/hFKnHRZfXHr6NSgIqvYkKqtnc+C0FWY696IZMzgCksOO8jB1xrxoSN3tANW3oJ PDV+4og0fN/Fr1capJUFEc/fejREHneANvlKrLaa8ht0qJQutoczNADUiSFLcMPG iHljXeDsv5eyjMtUuIL8+MPzcrIt/y4rY41ZPiKggxULrXc6H+JJL/e/zThZpXc= =iv2z -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "First batch of KVM changes for 4.4. s390: A bunch of fixes and optimizations for interrupt and time handling. PPC: Mostly bug fixes. ARM: No big features, but many small fixes and prerequisites including: - a number of fixes for the arch-timer - introducing proper level-triggered semantics for the arch-timers - a series of patches to synchronously halt a guest (prerequisite for IRQ forwarding) - some tracepoint improvements - a tweak for the EL2 panic handlers - some more VGIC cleanups getting rid of redundant state x86: Quite a few changes: - support for VT-d posted interrupts (i.e. PCI devices can inject interrupts directly into vCPUs). This introduces a new component (in virt/lib/) that connects VFIO and KVM together. The same infrastructure will be used for ARM interrupt forwarding as well. - more Hyper-V features, though the main one Hyper-V synthetic interrupt controller will have to wait for 4.5. These will let KVM expose Hyper-V devices. - nested virtualization now supports VPID (same as PCID but for vCPUs) which makes it quite a bit faster - for future hardware that supports NVDIMM, there is support for clflushopt, clwb, pcommit - support for "split irqchip", i.e. LAPIC in kernel + IOAPIC/PIC/PIT in userspace, which reduces the attack surface of the hypervisor - obligatory smattering of SMM fixes - on the guest side, stable scheduler clock support was rewritten to not require help from the hypervisor" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (123 commits) KVM: VMX: Fix commit which broke PML KVM: x86: obey KVM_X86_QUIRK_CD_NW_CLEARED in kvm_set_cr0() KVM: x86: allow RSM from 64-bit mode KVM: VMX: fix SMEP and SMAP without EPT KVM: x86: move kvm_set_irq_inatomic to legacy device assignment KVM: device assignment: remove pointless #ifdefs KVM: x86: merge kvm_arch_set_irq with kvm_set_msi_inatomic KVM: x86: zero apic_arb_prio on reset drivers/hv: share Hyper-V SynIC constants with userspace KVM: x86: handle SMBASE as physical address in RSM KVM: x86: add read_phys to x86_emulate_ops KVM: x86: removing unused variable KVM: don't pointlessly leave KVM_COMPAT=y in non-KVM configs KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr() KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings KVM: arm/arm64: Optimize away redundant LR tracking KVM: s390: use simple switch statement as multiplexer KVM: s390: drop useless newline in debugging data KVM: s390: SCA must not cross page boundaries KVM: arm: Do not indent the arguments of DECLARE_BITMAP ...	2015-11-05 16:26:26 -08:00
Linus Torvalds	0d51ce9ca1	Power management and ACPI updates for v4.4-rc1 - ACPICA update to upstream revision 20150930 (Bob Moore, Lv Zheng). The most significant change is to allow the AML debugger to be built into the kernel. On top of that there is an update related to the NFIT table (the ACPI persistent memory interface) and a few fixes and cleanups. - ACPI CPPC2 (Collaborative Processor Performance Control v2) support along with a cpufreq frontend (Ashwin Chaugule). This can only be enabled on ARM64 at this point. - New ACPI infrastructure for the early probing of IRQ chips and clock sources (Marc Zyngier). - Support for a new hierarchical properties extension of the ACPI _DSD (Device Specific Data) device configuration object allowing the kernel to handle hierarchical properties (provided by the platform firmware this way) automatically and make them available to device drivers via the generic device properties interface (Rafael Wysocki). - Generic device properties API extension to obtain an index of certain string value in an array of strings, along the lines of of_property_match_string(), but working for all of the supported firmware node types, and support for the "dma-names" device property based on it (Mika Westerberg). - ACPI core fix to parse the MADT (Multiple APIC Description Table) entries in the order expected by platform firmware (and mandated by the specification) to avoid confusion on systems with more than 255 logical CPUs (Lukasz Anaczkowski). - Consolidation of the ACPI-based handling of PCI host bridges on x86 and ia64 (Jiang Liu). - ACPI core fixes to ensure that the correct IRQ number is used to represent the SCI (System Control Interrupt) in the cases when it has been re-mapped (Chen Yu). - New ACPI backlight quirk for Lenovo IdeaPad S405 (Hans de Goede). - ACPI EC driver fixes (Lv Zheng). - Assorted ACPI fixes and cleanups (Dan Carpenter, Insu Yun, Jiri Kosina, Rami Rosen, Rasmus Villemoes). - New mechanism in the PM core allowing drivers to check if the platform firmware is going to be involved in the upcoming system suspend or if it has been involved in the suspend the system is resuming from at the moment (Rafael Wysocki). This should allow drivers to optimize their suspend/resume handling in some cases and the changes include a couple of users of it (the i8042 input driver, PCI PM). - PCI PM fix to prevent runtime-suspended devices with PME enabled from being resumed during system suspend even if they aren't configured to wake up the system from sleep (Rafael Wysocki). - New mechanism to report the number of a wakeup IRQ that woke up the system from sleep last time (Alexandra Yates). - Removal of unused interfaces from the generic power domains framework and fixes related to latency measurements in that code (Ulf Hansson, Daniel Lezcano). - cpufreq core sysfs interface rework to make it handle CPUs that share performance scaling settings (represented by a common cpufreq policy object) more symmetrically (Viresh Kumar). This should help to simplify the CPU offline/online handling among other things. - cpufreq core fixes and cleanups (Viresh Kumar). - intel_pstate fixes related to the Turbo Activation Ratio (TAR) mechanism on client platforms which causes the turbo P-states range to vary depending on platform firmware settings (Srinivas Pandruvada). - intel_pstate sysfs interface fix (Prarit Bhargava). - Assorted cpufreq driver (imx, tegra20, powernv, integrator) fixes and cleanups (Bai Ping, Bartlomiej Zolnierkiewicz, Shilpasri G Bhat, Luis de Bethencourt). - cpuidle mvebu driver cleanups (Russell King). - OPP (Operating Performance Points) framework code reorganization to make it more maintainable (Viresh Kumar). - Intel Broxton support for the RAPL (Running Average Power Limits) power capping driver (Amy Wiles). - Assorted power management code fixes and cleanups (Dan Carpenter, Geert Uytterhoeven, Geliang Tang, Luis de Bethencourt, Rasmus Villemoes). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJWOC9oAAoJEILEb/54YlRx/c8P/joflwoFsISwJccG62YTQMuc bMQKM4Kw0vl5La8+pkLpe5t6+mW7l81UFtYF6Dzd8LOKlD9sszD34z1lHmCeT/oR wn0uZpHagRyLMUfoyiEtlU/VRU6WQNNtS3EgjwUi7xgFz9Q0pjcCZ9OQ6vKov1j5 +6j40ODif5sgo+2vl+rztJiV0SIMkYdkgNqgfN1FE9bdLA2Zkk+PxxJbtGQORuDu O/K+XhQT2xWquVWi/1p+VtQxs5glBS1oKm0kogV5bElCvNTRNIVABUNcjogITQwo QSAKgoCKIoaIl5jtDT6u5dc0y67q/dMtqOY9fOCcOz1Z7jbWQzR8D7mpFWIsJUPK K2LClI3t85ynpN6Jref246A6+C9nwB8JMAiAR11oBw7WbBlkd6tbRgcT5B+iz8UE FuCCif7pha/Fs+Jt1YRazscIqteQ2bAhhxikuIPMfw2M6M67MNfVNeKA1bAoSM34 dH7JsilblitvV7shrwJHwXPXCOF2jEPoK8I4/q2+TR5qUxEpRJjelQxXGSaQScMZ iNnjeTgv8H8q+rY5Yjzsl4pxP0Fvf7IuqkptWOJbgepg4cQc9pS87wOpY3uEeQzr H7ruaQJFCnLO4aXbPNClsiJARhrBk+qMlsh4vBEyCJ2T0ucb+nIUcN4BTi8t85yl X97BfHHUiDoUrnIsNids =1gaH -----END PGP SIGNATURE----- Merge tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "Quite a new features are included this time. First off, the Collaborative Processor Performance Control interface (version 2) defined by ACPI will now be supported on ARM64 along with a cpufreq frontend for CPU performance scaling. Second, ACPI gets a new infrastructure for the early probing of IRQ chips and clock sources (along the lines of the existing similar mechanism for DT). Next, the ACPI core and the generic device properties API will now support a recently introduced hierarchical properties extension of the _DSD (Device Specific Data) ACPI device configuration object. If the ACPI platform firmware uses that extension to organize device properties in a hierarchical way, the kernel will automatically handle it and make those properties available to device drivers via the generic device properties API. It also will be possible to build the ACPICA's AML interpreter debugger into the kernel now and use that to diagnose AML-related problems more efficiently. In the future, this should make it possible to single-step AML execution and do similar things. Interesting stuff, although somewhat experimental at this point. Finally, the PM core gets a new mechanism that can be used by device drivers to distinguish between suspend-to-RAM (based on platform firmware support) and suspend-to-idle (or other variants of system suspend the platform firmware is not involved in) and possibly optimize their device suspend/resume handling accordingly. In addition to that, some existing features are re-organized quite substantially. First, the ACPI-based handling of PCI host bridges on x86 and ia64 is unified and the common code goes into the ACPI core (so as to reduce code duplication and eliminate non-essential differences between the two architectures in that area). Second, the Operating Performance Points (OPP) framework is reorganized to make the code easier to find and follow. Next, the cpufreq core's sysfs interface is reorganized to get rid of the "primary CPU" concept for configurations in which the same performance scaling settings are shared between multiple CPUs. Finally, some interfaces that aren't necessary any more are dropped from the generic power domains framework. On top of the above we have some minor extensions, cleanups and bug fixes in multiple places, as usual. Specifics: - ACPICA update to upstream revision 20150930 (Bob Moore, Lv Zheng). The most significant change is to allow the AML debugger to be built into the kernel. On top of that there is an update related to the NFIT table (the ACPI persistent memory interface) and a few fixes and cleanups. - ACPI CPPC2 (Collaborative Processor Performance Control v2) support along with a cpufreq frontend (Ashwin Chaugule). This can only be enabled on ARM64 at this point. - New ACPI infrastructure for the early probing of IRQ chips and clock sources (Marc Zyngier). - Support for a new hierarchical properties extension of the ACPI _DSD (Device Specific Data) device configuration object allowing the kernel to handle hierarchical properties (provided by the platform firmware this way) automatically and make them available to device drivers via the generic device properties interface (Rafael Wysocki). - Generic device properties API extension to obtain an index of certain string value in an array of strings, along the lines of of_property_match_string(), but working for all of the supported firmware node types, and support for the "dma-names" device property based on it (Mika Westerberg). - ACPI core fix to parse the MADT (Multiple APIC Description Table) entries in the order expected by platform firmware (and mandated by the specification) to avoid confusion on systems with more than 255 logical CPUs (Lukasz Anaczkowski). - Consolidation of the ACPI-based handling of PCI host bridges on x86 and ia64 (Jiang Liu). - ACPI core fixes to ensure that the correct IRQ number is used to represent the SCI (System Control Interrupt) in the cases when it has been re-mapped (Chen Yu). - New ACPI backlight quirk for Lenovo IdeaPad S405 (Hans de Goede). - ACPI EC driver fixes (Lv Zheng). - Assorted ACPI fixes and cleanups (Dan Carpenter, Insu Yun, Jiri Kosina, Rami Rosen, Rasmus Villemoes). - New mechanism in the PM core allowing drivers to check if the platform firmware is going to be involved in the upcoming system suspend or if it has been involved in the suspend the system is resuming from at the moment (Rafael Wysocki). This should allow drivers to optimize their suspend/resume handling in some cases and the changes include a couple of users of it (the i8042 input driver, PCI PM). - PCI PM fix to prevent runtime-suspended devices with PME enabled from being resumed during system suspend even if they aren't configured to wake up the system from sleep (Rafael Wysocki). - New mechanism to report the number of a wakeup IRQ that woke up the system from sleep last time (Alexandra Yates). - Removal of unused interfaces from the generic power domains framework and fixes related to latency measurements in that code (Ulf Hansson, Daniel Lezcano). - cpufreq core sysfs interface rework to make it handle CPUs that share performance scaling settings (represented by a common cpufreq policy object) more symmetrically (Viresh Kumar). This should help to simplify the CPU offline/online handling among other things. - cpufreq core fixes and cleanups (Viresh Kumar). - intel_pstate fixes related to the Turbo Activation Ratio (TAR) mechanism on client platforms which causes the turbo P-states range to vary depending on platform firmware settings (Srinivas Pandruvada). - intel_pstate sysfs interface fix (Prarit Bhargava). - Assorted cpufreq driver (imx, tegra20, powernv, integrator) fixes and cleanups (Bai Ping, Bartlomiej Zolnierkiewicz, Shilpasri G Bhat, Luis de Bethencourt). - cpuidle mvebu driver cleanups (Russell King). - OPP (Operating Performance Points) framework code reorganization to make it more maintainable (Viresh Kumar). - Intel Broxton support for the RAPL (Running Average Power Limits) power capping driver (Amy Wiles). - Assorted power management code fixes and cleanups (Dan Carpenter, Geert Uytterhoeven, Geliang Tang, Luis de Bethencourt, Rasmus Villemoes)" * tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (108 commits) cpufreq: postfix policy directory with the first CPU in related_cpus cpufreq: create cpu/cpufreq/policyX directories cpufreq: remove cpufreq_sysfs_{create\|remove}_file() cpufreq: create cpu/cpufreq at boot time cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask cpufreq: ondemand: Drop unnecessary locks from update_sampling_rate() PM / Domains: Merge measurements for PM QoS device latencies PM / Domains: Don't measure ->start\|stop() latency in system PM callbacks PM / clk: Fix broken build due to non-matching code and header #ifdefs ACPI / Documentation: add copy_dsdt to ACPI format options ACPI / sysfs: correctly check failing memory allocation ACPI / video: Add a quirk to force native backlight on Lenovo IdeaPad S405 ACPI / CPPC: Fix potential memory leak ACPI / CPPC: signedness bug in register_pcc_channel() ACPI / PAD: power_saving_thread() is not freezable ACPI / PM: Fix incorrect wakeup IRQ setting during suspend-to-idle ACPI: Using correct irq when waiting for events ACPI: Use correct IRQ when uninstalling ACPI interrupt handler cpuidle: mvebu: disable the bind/unbind attributes and use builtin_platform_driver cpuidle: mvebu: clean up multiple platform drivers ...	2015-11-04 18:10:13 -08:00
Linus Torvalds	4302d506d5	Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 sigcontext header cleanups from Ingo Molnar: "This series reorganizes and cleans up various aspects of the main sigcontext UAPI headers, such as unifying the data structures and updating/adding lots of comments to explain all the ABI details and quirks. The headers can now also be built in user-space standalone" * 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/headers: Clean up too long lines x86/headers: Remove <asm/sigcontext.h> references on the kernel side x86/headers: Remove direct sigcontext32.h uses x86/headers: Convert sigcontext_ia32 uses to sigcontext_32 x86/headers: Unify 'struct sigcontext_ia32' and 'struct sigcontext_32' x86/headers: Make sigcontext pointers bit independent x86/headers: Move the 'struct sigcontext' definitions into the UAPI header x86/headers: Clean up the kernel's struct sigcontext types to be ABI-clean x86/headers: Convert uses of _fpstate_ia32 to _fpstate_32 x86/headers: Unify 'struct _fpstate_ia32' and i386 struct _fpstate x86/headers: Unify register type definitions between 32-bit compat and i386 x86/headers: Use ABI types consistently in sigcontext*.h x86/headers: Separate out legacy user-space structure definitions x86/headers: Clean up and better document uapi/asm/sigcontext.h x86/headers: Clean up uapi/asm/sigcontext32.h x86/headers: Fix (old) header file dependency bug in uapi/asm/sigcontext32.h	2015-11-03 21:05:40 -08:00
Linus Torvalds	ce4d72fac1	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu changes from Ingo Molnar: "There are two main areas of changes: - Rework of the extended FPU state code to robustify the kernel's usage of cpuid provided xstate sizes - and related changes (Dave Hansen)" - math emulation enhancements: new modern FPU instructions support, with testcases, plus cleanups (Denys Vlasnko)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/fpu: Fixup uninitialized feature_name warning x86/fpu/math-emu: Add support for FISTTP instructions x86/fpu/math-emu, selftests: Add test for FISTTP instructions x86/fpu/math-emu: Add support for FCMOVcc insns x86/fpu/math-emu: Add support for F[U]COMI[P] insns x86/fpu/math-emu: Remove define layer for undocumented opcodes x86/fpu/math-emu, selftests: Add tests for FCMOV and FCOMI insns x86/fpu/math-emu: Remove !NO_UNDOC_CODE x86/fpu: Check CPU-provided sizes against struct declarations x86/fpu: Check to ensure increasing-offset xstate offsets x86/fpu: Correct and check XSAVE xstate size calculations x86/fpu: Add C structures for AVX-512 state components x86/fpu: Rework YMM definition x86/fpu/mpx: Rework MPX 'xstate' types x86/fpu: Add xfeature_enabled() helper instead of test_bit() x86/fpu: Remove 'xfeature_nr' x86/fpu: Rework XSTATE_* macros to remove magic '2' x86/fpu: Rename XFEATURES_NR_MAX x86/fpu: Rename XSAVE macros x86/fpu: Remove partial LWP support definitions ...	2015-11-03 20:50:26 -08:00
Linus Torvalds	0f25f2c1b1	Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 kgdb fixlet from Ingo Molnar: "A single debugging related commit: compress the memory usage of a kgdb data structure" * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kgdb: Replace bool_int_array[NR_CPUS] with bitmap	2015-11-03 20:12:10 -08:00
Linus Torvalds	f323c49b30	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu changes from Ingo Molnar: "Two changes in this cycle: a Kconfig help text enhancement, and an AMD CLZERO instruction capability detection and enumeration" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add CLZERO detection x86/Kconfig/cpus: Fix/complete CPU type help texts	2015-11-03 19:39:42 -08:00
Linus Torvalds	33d46f9765	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: "An early_printk cleanup plus deinlining enhancements" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/early_printk: Set __iomem address space for IO x86/signal: Deinline get_sigframe, save 240 bytes x86: Deinline early_console_register, save 403 bytes x86/e820: Deinline e820_type_to_string, save 126 bytes	2015-11-03 19:34:22 -08:00
Linus Torvalds	378e4e9825	Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot cleanup from Ingo Molnar: "A single commit: remove an obsolete kcrash boot flag" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kexec: Remove obsolete 'in_crash_kexec' flag	2015-11-03 19:28:37 -08:00
Linus Torvalds	a75a3f6fc9	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm changes from Ingo Molnar: "The main change in this cycle is another step in the big x86 system call interface rework by Andy Lutomirski, which moves most of the low level x86 entry code from assembly to C, for all syscall entries except native 64-bit system calls: arch/x86/entry/entry_32.S \| 182 ++++------ arch/x86/entry/entry_64_compat.S \| 547 ++++++++----------------------- 194 insertions(+), 535 deletions(-) ... our hope is that the final remaining step (converting native 64-bit system calls) will be less painful as all the previous steps, given that most of the legacies and quirks are concentrated around native 32-bit and compat environments" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) x86/entry/32: Fix FS and GS restore in opportunistic SYSEXIT x86/entry/32: Fix entry_INT80_32() to expect interrupts to be on um/x86: Fix build after x86 syscall changes x86/asm: Remove the xyz_cfi macros from dwarf2.h selftests/x86: Style fixes for the 'unwind_vdso' test x86/entry/64/compat: Document sysenter_fix_flags's reason for existence x86/entry: Split and inline syscall_return_slowpath() x86/entry: Split and inline prepare_exit_to_usermode() x86/entry: Use pt_regs_to_thread_info() in syscall entry tracing x86/entry: Hide two syscall entry assertions behind CONFIG_DEBUG_ENTRY x86/entry: Micro-optimize compat fast syscall arg fetch x86/entry: Force inlining of 32-bit syscall code x86/entry: Make irqs_disabled checks in exit code depend on lockdep x86/entry: Remove unnecessary IRQ twiddling in fast 32-bit syscalls x86/asm: Remove thread_info.sysenter_return x86/entry/32: Re-implement SYSENTER using the new C path x86/entry/32: Switch INT80 to the new C syscall path x86/entry/32: Open-code return tracking from fork and kthreads x86/entry/compat: Implement opportunistic SYSRETL for compat syscalls x86/vdso/compat: Wire up SYSENTER and SYSCSALL for compat userspace ...	2015-11-03 18:59:10 -08:00
Linus Torvalds	d2bea739f8	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic changes from Ingo Molnar: "The main changes in this cycle were: - Numachip updates: new hardware support, fixes and cleanups. (Daniel J Blueman) - misc smaller cleanups and fixlets" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/io_apic: Make eoi_ioapic_pin() static x86/irq: Drop unlikely before IS_ERR_OR_NULL x86/x2apic: Make stub functions available even if !CONFIG_X86_LOCAL_APIC x86/apic: Deinline various functions x86/numachip: Fix timer build conflict x86/numachip: Introduce Numachip2 timer mechanisms x86/numachip: Add Numachip IPI optimisations x86/numachip: Add Numachip2 APIC support x86/numachip: Cleanup Numachip support	2015-11-03 18:33:15 -08:00
Linus Torvalds	53528695ff	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "The main changes in this cycle were: - sched/fair load tracking fixes and cleanups (Byungchul Park) - Make load tracking frequency scale invariant (Dietmar Eggemann) - sched/deadline updates (Juri Lelli) - stop machine fixes, cleanups and enhancements for bugs triggered by CPU hotplug stress testing (Oleg Nesterov) - scheduler preemption code rework: remove PREEMPT_ACTIVE and related cleanups (Peter Zijlstra) - Rework the sched_info::run_delay code to fix races (Peter Zijlstra) - Optimize per entity utilization tracking (Peter Zijlstra) - ... misc other fixes, cleanups and smaller updates" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) sched: Don't scan all-offline ->cpus_allowed twice if !CONFIG_CPUSETS sched: Move cpu_active() tests from stop_two_cpus() into migrate_swap_stop() sched: Start stopper early stop_machine: Kill cpu_stop_threads->setup() and cpu_stop_unpark() stop_machine: Kill smp_hotplug_thread->pre_unpark, introduce stop_machine_unpark() stop_machine: Change cpu_stop_queue_two_works() to rely on stopper->enabled stop_machine: Introduce __cpu_stop_queue_work() and cpu_stop_queue_two_works() stop_machine: Ensure that a queued callback will be called before cpu_stop_park() sched/x86: Fix typo in __switch_to() comments sched/core: Remove a parameter in the migrate_task_rq() function sched/core: Drop unlikely behind BUG_ON() sched/core: Fix task and run queue sched_info::run_delay inconsistencies sched/numa: Fix task_tick_fair() from disabling numa_balancing sched/core: Add preempt_count invariant check sched/core: More notrace annotations sched/core: Kill PREEMPT_ACTIVE sched/core, sched/x86: Kill thread_info::saved_preempt_count sched/core: Simplify preempt_count tests sched/core: Robustify preemption leak checks sched/core: Stop setting PREEMPT_ACTIVE ...	2015-11-03 18:03:50 -08:00
Linus Torvalds	b831ef2cad	Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS changes from Ingo Molnar: "The main system reliability related changes were from x86, but also some generic RAS changes: - AMD MCE error injection subsystem enhancements. (Aravind Gopalakrishnan) - Fix MCE and CPU hotplug interaction bug. (Ashok Raj) - kcrash bootup robustness fix. (Baoquan He) - kcrash cleanups. (Borislav Petkov) - x86 microcode driver rework: simplify it by unmodularizing it and other cleanups. (Borislav Petkov)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init() x86/mce: Add a Scalable MCA vendor flags bit MAINTAINERS: Unify the microcode driver section x86/microcode/intel: Move #ifdef DEBUG inside the function x86/microcode/amd: Remove maintainers from comments x86/microcode: Remove modularization leftovers x86/microcode: Merge the early microcode loader x86/microcode: Unmodularize the microcode driver x86/mce: Fix thermal throttling reporting after kexec kexec/crash: Say which char is the unrecognized x86/setup/crash: Check memblock_reserve() retval x86/setup/crash: Cleanup some more x86/setup/crash: Remove alignment variable x86/setup: Cleanup crashkernel reservation functions x86/amd_nb, EDAC: Rename amd_get_node_id() x86/setup: Do not reserve crashkernel high memory if low reservation failed x86/microcode/amd: Do not overwrite final patch levels x86/microcode/amd: Extract current patch level read to a function x86/ras/mce_amd_inj: Inject bank 4 errors on the NBC x86/ras/mce_amd_inj: Trigger deferred and thresholding errors interrupts ...	2015-11-03 17:51:33 -08:00
Linus Torvalds	b02ac6b18c	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - Improve accuracy of perf/sched clock on x86. (Adrian Hunter) - Intel DS and BTS updates. (Alexander Shishkin) - Intel cstate PMU support. (Kan Liang) - Add group read support to perf_event_read(). (Peter Zijlstra) - Branch call hardware sampling support, implemented on x86 and PowerPC. (Stephane Eranian) - Event groups transactional interface enhancements. (Sukadev Bhattiprolu) - Enable proper x86/intel/uncore PMU support on multi-segment PCI systems. (Taku Izumi) - ... misc fixes and cleanups. The perf tooling team was very busy again with 200+ commits, the full diff doesn't fit into lkml size limits. Here's an (incomplete) list of the tooling highlights: New features: - Change the default event used in all tools (record/top): use the most precise "cycles" hw counter available, i.e. when the user doesn't specify any event, it will try using cycles:ppp, cycles:pp, etc and fall back transparently until it finds a working counter. (Arnaldo Carvalho de Melo) - Integration of perf with eBPF that, given an eBPF .c source file (or .o file built for the 'bpf' target with clang), will get it automatically built, validated and loaded into the kernel via the sys_bpf syscall, which can then be used and seen using 'perf trace' and other tools. (Wang Nan) Various user interface improvements: - Automatic pager invocation on long help output. (Namhyung Kim) - Search for more options when passing args to -h, e.g.: (Arnaldo Carvalho de Melo) $ perf report -h interface Usage: perf report [<options>] --gtk Use the GTK2 interface --stdio Use the stdio interface --tui Use the TUI interface - Show ordered command line options when -h is used or when an unknown option is specified. (Arnaldo Carvalho de Melo) - If options are passed after -h, show just its descriptions, not all options. (Arnaldo Carvalho de Melo) - Implement column based horizontal scrolling in the hists browser (top, report), making it possible to use the TUI for things like 'perf mem report' where there are many more columns than can fit in a terminal. (Arnaldo Carvalho de Melo) - Enhance the error reporting of tracepoint event parsing, e.g.: $ oldperf record -e sched:sched_switc usleep 1 event syntax error: 'sched:sched_switc' \___ unknown tracepoint Run 'perf list' for a list of valid events Now we get the much nicer: $ perf record -e sched:sched_switc ls event syntax error: 'sched:sched_switc' \___ can't access trace events Error: No permissions to read /sys/kernel/debug/tracing/events/sched/sched_switc Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug' And after we have those mount point permissions fixed: $ perf record -e sched:sched_switc ls event syntax error: 'sched:sched_switc' \___ unknown tracepoint Error: File /sys/kernel/debug/tracing/events/sched/sched_switc not found. Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?. I.e. basically now the event parsing routing uses the strerror_open() routines introduced by and used in 'perf trace' work. (Jiri Olsa) - Fail properly when pattern matching fails to find a tracepoint, i.e. '-e non:existent' was being correctly handled, with a proper error message about that not being a valid event, but '-e non:existent' wasn't, fix it. (Jiri Olsa) - Do event name substring search as last resort in 'perf list'. (Arnaldo Carvalho de Melo) E.g.: # perf list clock List of pre-defined events (to be used in -e): cpu-clock [Software event] task-clock [Software event] uncore_cbox_0/clockticks/ [Kernel PMU event] uncore_cbox_1/clockticks/ [Kernel PMU event] kvm:kvm_pvclock_update [Tracepoint event] kvm:kvm_update_master_clock [Tracepoint event] power:clock_disable [Tracepoint event] power:clock_enable [Tracepoint event] power:clock_set_rate [Tracepoint event] syscalls:sys_enter_clock_adjtime [Tracepoint event] syscalls:sys_enter_clock_getres [Tracepoint event] syscalls:sys_enter_clock_gettime [Tracepoint event] syscalls:sys_enter_clock_nanosleep [Tracepoint event] syscalls:sys_enter_clock_settime [Tracepoint event] syscalls:sys_exit_clock_adjtime [Tracepoint event] syscalls:sys_exit_clock_getres [Tracepoint event] syscalls:sys_exit_clock_gettime [Tracepoint event] syscalls:sys_exit_clock_nanosleep [Tracepoint event] syscalls:sys_exit_clock_settime [Tracepoint event] Intel PT hardware tracing enhancements: - Accept a zero --itrace period, meaning "as often as possible". In the case of Intel PT that is the same as a period of 1 and a unit of 'instructions' (i.e. --itrace=i1i). (Adrian Hunter) - Harmonize itrace's synthesized callchains with the existing --max-stack tool option. (Adrian Hunter) - Allow time to be displayed in nanoseconds in 'perf script'. (Adrian Hunter) - Fix potential infinite loop when handling Intel PT timestamps. (Adrian Hunter) - Slighly improve Intel PT debug logging. (Adrian Hunter) - Warn when AUX data has been lost, just like when processing PERF_RECORD_LOST. (Adrian Hunter) - Further document export-to-postgresql.py script. (Adrian Hunter) - Add option to synthesize branch stack from auxtrace data. (Adrian Hunter) Misc notable changes: - Switch the default callchain output mode to 'graph,0.5,caller', to make it look like the default for other tools, reducing the learning curve for people used to 'caller' based viewing. (Arnaldo Carvalho de Melo) - various call chain usability enhancements. (Namhyung Kim) - Introduce the 'P' event modifier, meaning 'max precision level, please', i.e.: $ perf record -e cycles:P usleep 1 Is now similar to: $ perf record usleep 1 Useful, for instance, when specifying multiple events. (Jiri Olsa) - Add 'socket' sort entry, to sort by the processor socket in 'perf top' and 'perf report'. (Kan Liang) - Introduce --socket-filter to 'perf report', for filtering by processor socket. (Kan Liang) - Add new "Zoom into Processor Socket" operation in the perf hists browser, used in 'perf top' and 'perf report'. (Kan Liang) - Allow probing on kmodules without DWARF. (Masami Hiramatsu) - Fix 'perf probe -l' for probes added to kernel module functions. (Masami Hiramatsu) - Preparatory work for the 'perf stat record' feature that will allow generating perf.data files with counting data in addition to the sampling mode we have now (Jiri Olsa) - Update libtraceevent KVM plugin. (Paolo Bonzini) - ... plus lots of other enhancements that I failed to list properly, by: Adrian Hunter, Alexander Shishkin, Andi Kleen, Andrzej Hajda, Arnaldo Carvalho de Melo, Dima Kogan, Don Zickus, Geliang Tang, He Kuang, Huaitong Han, Ingo Molnar, Jan Stancek, Jiri Olsa, Kan Liang, Kirill Tkhai, Masami Hiramatsu, Matt Fleming, Namhyung Kim, Paolo Bonzini, Peter Zijlstra, Rabin Vincent, Scott Wood, Stephane Eranian, Sukadev Bhattiprolu, Taku Izumi, Vaishali Thakkar, Wang Nan, Yang Shi and Yunlong Song" 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (260 commits) perf unwind: Pass symbol source to libunwind tools build: Fix libiberty feature detection perf tools: Compile scriptlets to BPF objects when passing '.c' to --event perf record: Add clang options for compiling BPF scripts perf bpf: Attach eBPF filter to perf event perf tools: Make sure fixdep is built before libbpf perf script: Enable printing of branch stack perf trace: Add cmd string table to decode sys_bpf first arg perf bpf: Collect perf_evsel in BPF object files perf tools: Load eBPF object into kernel perf tools: Create probe points for BPF programs perf tools: Enable passing bpf object file to --event perf ebpf: Add the libbpf glue perf tools: Make perf depend on libbpf perf symbols: Fix endless loop in dso__split_kallsyms_for_kcore perf tools: Enable pre-event inherit setting by config terms perf symbols: we can now read separate debug-info files based on a build ID perf symbols: Fix type error when reading a build-id perf tools: Search for more options when passing args to -h perf stat: Cache aggregated map entries in extra cpumap ...	2015-11-03 17:38:09 -08:00
Linus Torvalds	f5a8160c1e	Merge branch 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI changes from Ingo Molnar: "The main changes in this cycle were: - further EFI code generalization to make it more workable for ARM64 - various extensions, such as 64-bit framebuffer address support, UEFI v2.5 EFI_PROPERTIES_TABLE support - code modularization simplifications and cleanups - new debugging parameters - various fixes and smaller additions" * 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) efi: Fix warning of int-to-pointer-cast on x86 32-bit builds efi: Use correct type for struct efi_memory_map::phys_map x86/efi: Fix kernel panic when CONFIG_DEBUG_VIRTUAL is enabled efi: Add "efi_fake_mem" boot option x86/efi: Rename print_efi_memmap() to efi_print_memmap() efi: Auto-load the efi-pstore module efi: Introduce EFI_NX_PE_DATA bit and set it from properties table efi: Add support for UEFIv2.5 Properties table efi: Add EFI_MEMORY_MORE_RELIABLE support to efi_md_typeattr_format() efifb: Add support for 64-bit frame buffer addresses efi/arm64: Clean up efi_get_fdt_params() interface arm64: Use core efi=debug instead of uefi_debug command line parameter efi/x86: Move efi=debug option parsing to core drivers/firmware: Make efi/esrt.c driver explicitly non-modular efi: Use the generic efi.memmap instead of 'memmap' acpi/apei: Use appropriate pgprot_t to map GHES memory arm64, acpi/apei: Implement arch_apei_get_mem_attributes() arm64/mm: Add PROT_DEVICE_nGnRnE and PROT_NORMAL_WT acpi, x86: Implement arch_apei_get_mem_attributes() efi, x86: Rearrange efi_mem_attributes() ...	2015-11-03 15:05:52 -08:00
Linus Torvalds	7b2a4306f9	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The timer departement provides: - More y2038 work in the area of ntp and pps. - Optimization of posix cpu timers - New time related selftests - Some new clocksource drivers - The usual pile of fixes, cleanups and improvements" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) timeconst: Update path in comment timers/x86/hpet: Type adjustments clocksource/drivers/armada-370-xp: Implement ARM delay timer clocksource/drivers/tango_xtal: Add new timer for Tango SoCs clocksource/drivers/imx: Allow timer irq affinity change clocksource/drivers/exynos_mct: Use container_of() instead of this_cpu_ptr() clocksource/drivers/h8300_*: Remove unneeded memset()s clocksource/drivers/sh_cmt: Remove unneeded memset() in sh_cmt_setup() clocksource/drivers/em_sti: Remove unneeded memset()s clocksource/drivers/mediatek: Use GPT as sched clock source clockevents/drivers/mtk: Fix spurious interrupt leading to crash posix_cpu_timer: Reduce unnecessary sighand lock contention posix_cpu_timer: Convert cputimer->running to bool posix_cpu_timer: Check thread timers only when there are active thread timers posix_cpu_timer: Optimize fastpath_timer_check() timers, kselftest: Add 'adjtick' test to validate adjtimex() tick adjustments timers: Use __fls in apply_slack() clocksource: Remove return statement from void functions net: sfc: avoid using timespec ntp/pps: use y2038 safe types in pps_event_time ...	2015-11-03 14:13:41 -08:00
Wan Zongshun	2167ceabf3	x86/cpu: Add CLZERO detection AMD Fam17h processors introduce support for the CLZERO instruction. It zeroes out the 64 byte cache line specified in RAX. Add the bit here to allow /proc/cpuinfo to list the feature. Boris: we're adding this as a separate ->x86_capability leaf because CPUID_80000008_EBX is going to contain more feature bits and it will fill out with time. Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com> Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> [ Wrap code in patch form, fix comments. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1446207099-24948-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-11-01 11:26:23 +01:00
Borislav Petkov	dc34bdd236	x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init() Caught by building with W= which enable -Wswitch-default also. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1446207099-24948-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-11-01 11:26:14 +01:00
Aravind Gopalakrishnan	c7f54d21fb	x86/mce: Add a Scalable MCA vendor flags bit Scalable MCA (SMCA) is a new feature in AMD Fam17h processors which indicates presence of MCA extensions. MCA extensions expands existing register space for the MCE banks and also introduces a new MSR range to accommodate new banks. Add the detection bit. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> [ Reformat mce_vendor_flags definitions and save indentation levels. Improve comments. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1446207099-24948-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-11-01 11:26:13 +01:00
Andy Lutomirski	2459ee8651	x86/vm86: Set thread.vm86 to NULL on fork/clone thread.vm86 points to per-task information -- the pointer should not be copied on clone. Fixes: `d4ce0f26c7` ("x86/vm86: Move fields from 'struct kernel_vm86_struct' to 'struct vm86'") Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Stas Sergeev <stsp@list.ru> Link: http://lkml.kernel.org/r/71c5d6985d70ec8197c8d72f003823c81b7dcf99.1446270067.git.luto@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-10-31 09:50:25 +01:00
Werner Pawlitschko	ababae4410	x86/ioapic: Prevent NULL pointer dereference in setup_ioapic_dest() Commit `4857c91f0d` changed the way how irq affinity is setup in setup_ioapic_dest() from using the core helper function to unconditionally calling the irq_set_affinity() callback of the underlying irq chip. That results in a NULL pointer dereference for the rare case where the underlying irq chip is lapic_chip which has no irq_set_affinity() callback. lapic_chip is occasionally used for the timer interrupt (irq 0). The fix is simple: Check the availability of the callback instead of calling it unconditionally. Fixes: `4857c91f0d` "x86/ioapic: Force affinity setting in setup_ioapic_dest()" Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org	2015-10-27 09:18:34 +09:00
Ville Syrjälä	298a96c12b	x86/dma-mapping: Fix arch_dma_alloc_attrs() oops with NULL dev Commit `6894258eda` broke drivers that pass NULL as the device pointer to dma_alloc. The reason is that arch_dma_alloc_attrs() now calls dma_alloc_coherent_gfp_flags() which in turn calls dma_alloc_coherent_mask(), where the device pointer is dereferenced unconditionally. Fix things by moving the ISA DMA fallback device assignment before the call to dma_alloc_coherent_gfp_flags(). Fixes: `6894258eda` ("dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}") Reported-and-tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/1445807503-8920-1-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-10-26 14:59:36 +09:00
Rafael J. Wysocki	343ccb040e	Merge branches 'acpi-scan', 'acpi-tables', 'acpi-ec' and 'acpi-assorted' * acpi-scan: ACPI / scan: use kstrdup_const() in acpi_add_id() ACPI / scan: constify struct acpi_hardware_id::id ACPI / scan: constify first argument of struct acpi_scan_handler::match * acpi-tables: ACPI / tables: test the correct variable x86, ACPI: Handle apic/x2apic entries in MADT in correct order ACPI / tables: Add acpi_subtable_proc to ACPI table parsers * acpi-ec: ACPI / EC: Fix a race issue in acpi_ec_guard_event() ACPI / EC: Fix query handler related issues * acpi-assorted: ACPI: change acpi_sleep_proc_init() to return void ACPI: change init_acpi_device_notify() to return void	2015-10-25 22:54:46 +01:00
Borislav Petkov	c595ac2bac	x86/microcode/intel: Move #ifdef DEBUG inside the function ... and save us the stub. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1445334889-300-6-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:22:12 +02:00
Borislav Petkov	6f7fc44bf1	x86/microcode/amd: Remove maintainers from comments We have the MAINTAINERS file for that. Also, Andreas doesn't have the time for this work anymore. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andreas Herrmann <herrmann.der.user@googlemail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1445334889-300-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:22:12 +02:00
Borislav Petkov	6b26e1bf66	x86/microcode: Remove modularization leftovers Remove the remaining module functionality leftovers. Make "dis_ucode_ldr" an early_param and make it static again. Drop module aliases, autoloading table, description, etc. Bump version number, while at it. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1445334889-300-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:22:12 +02:00
Borislav Petkov	fe055896c0	x86/microcode: Merge the early microcode loader Merge the early loader functionality into the driver proper. The diff is huge but logically, it is simply moving code from the _early.c files into the main driver. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1445334889-300-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:22:12 +02:00
Borislav Petkov	9a2bc335f1	x86/microcode: Unmodularize the microcode driver Make CONFIG_MICROCODE a bool. It was practically a bool already anyway, since early loader was forcing it to =y. Regardless, there's no real reason to have something be a module which gets built-in on the majority of installations out there. And its not like there's noticeable change in functionality - we still can load late microcode - just the module glue disappears. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1445334889-300-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:22:11 +02:00
Jan Beulich	3d45ac4b35	timers/x86/hpet: Type adjustments Standardize on bool instead of an inconsistent mixture of u8 and plain 'int'. Also use u32 or 'unsigned int' instead of 'unsigned long' when a 32-bit type suffices, generating slightly better code on x86-64. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5624E3A002000078000AC49A@prv-mh.provo.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:17:32 +02:00
Andi Kleen	81ffdcdd97	x86/mce: Fix thermal throttling reporting after kexec The per CPU thermal vector init code checks if the thermal vector is already installed and complains and bails out if it is. This happens after kexec, as kernel shut down does not clear the thermal vector APIC register. This causes two problems: 1. So we always do not fully initialize thermal reports after kexec. The CPU is still likely initialized, as the previous kernel should have done it. But we don't set up the software pointer to the thermal vector, so reporting may end up with a unknown thermal interrupt message. 2. Also it complains for every logical CPU, even though the value is actually derived from BP only. The problem is that we end up with one message per CPU, so on larger systems it becomes very noisy and messes up the otherwise nicely formatted CPU bootup numbers in the kernel log. Just remove the check. I checked the code and there's no valid code paths where the thermal init code for a CPU could be called multiple times. Why the kernel does not clean up this value on shutdown: The thermal monitoring is controlled per logical CPU thread. Normal shutdown code is just running on one CPU. To disable it we would need a broadcast NMI to all CPUs on shut down. That's overkill for this. So we just ignore it after kexec. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1445246268-26285-9-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:10:57 +02:00
Borislav Petkov	6f3760570e	x86/setup/crash: Check memblock_reserve() retval memblock_reserve() can fail but the crashkernel reservation code doesn't check that and this can lead the user into believing that the crashkernel region was actually reserved. Make sure we check that return value and we exit early with a failure message in the error case. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Young <dyoung@redhat.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Chao <chaowang@redhat.com> Cc: jerry_hoemann@hp.com Link: http://lkml.kernel.org/r/1445246268-26285-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:10:56 +02:00
Borislav Petkov	f56d55781c	x86/setup/crash: Cleanup some more * Remove unused auto_set variable * Cleanup local function variable declarations * Reformat printk string and use pr_info() No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Young <dyoung@redhat.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Chao <chaowang@redhat.com> Cc: jerry_hoemann@hp.com Link: http://lkml.kernel.org/r/1445246268-26285-6-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:10:56 +02:00
Borislav Petkov	606134f77c	x86/setup/crash: Remove alignment variable Use a macro instead. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Young <dyoung@redhat.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Chao <chaowang@redhat.com> Cc: jerry_hoemann@hp.com Link: http://lkml.kernel.org/r/1445246268-26285-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:10:56 +02:00
Borislav Petkov	97eac21bab	x86/setup: Cleanup crashkernel reservation functions * Shorten variable names * Realign code, space out for better readability No code changed: # arch/x86/kernel/setup.o: text data bss dec hex filename 4543 3096 69904 77543 12ee7 setup.o.before 4543 3096 69904 77543 12ee7 setup.o.after md5: 8a1b7c6738a553ca207b56bd84a8f359 setup.o.before.asm 8a1b7c6738a553ca207b56bd84a8f359 setup.o.after.asm Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Young <dyoung@redhat.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Chao <chaowang@redhat.com> Cc: jerry_hoemann@hp.com Link: http://lkml.kernel.org/r/1445246268-26285-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:10:56 +02:00
Baoquan He	eb6db83d10	x86/setup: Do not reserve crashkernel high memory if low reservation failed People reported that when allocating crashkernel memory using the ",high" and ",low" syntax, there were cases where the reservation of the high portion succeeds but the reservation of the low portion fails. Then kexec can load the kdump kernel successfully, but booting the kdump kernel fails as there's no low memory. The low memory allocation for the kdump kernel can fail on large systems for a couple of reasons. For example, the manually specified crashkernel low memory can be too large and thus no adequate memblock region would be found. Therefore, we try to reserve low memory for the crash kernel after the high memory portion has been allocated. If that fails, we free crashkernel high memory too and return. The user can then take measures accordingly. Tested-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Baoquan He <bhe@redhat.com> [ Massage text. ] Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Joerg Roedel <jroedel@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Young <dyoung@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: WANG Chao <chaowang@redhat.com> Cc: jerry_hoemann@hp.com Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/1445246268-26285-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-21 11:10:55 +02:00
Andrey Ryabinin	f7d27c35dd	x86/mm, kasan: Silence KASAN warnings in get_wchan() get_wchan() is racy by design, it may access volatile stack of running task, thus it may access redzone in a stack frame and cause KASAN to warn about this. Use READ_ONCE_NOCHECK() to silence these warnings. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kostya Serebryany <kcc@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de> Cc: kasan-dev <kasan-dev@googlegroups.com> Link: http://lkml.kernel.org/r/1445243838-17763-3-git-send-email-aryabinin@virtuozzo.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-20 11:04:19 +02:00
Stephane Eranian	d892819faa	perf/x86: Add support for PERF_SAMPLE_BRANCH_CALL This patch enables the suport for the PERF_SAMPLE_BRANCH_CALL for Intel x86 processors. When the processor support LBR filtering this the selection is done in hardware. Otherwise, the filter is applied by software. Note that we chose to include zero length calls because they also represent calls. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: khandual@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1444720151-10275-3-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-20 10:30:53 +02:00
Adrian Hunter	b9511cd761	perf/x86: Fix time_shift in perf_event_mmap_page Commit: `b20112edea` ("perf/x86: Improve accuracy of perf/sched clock") allowed the time_shift value in perf_event_mmap_page to be as much as 32. Unfortunately the documented algorithms for using time_shift have it shifting an integer, whereas to work correctly with the value 32, the type must be u64. In the case of perf tools, Intel PT decodes correctly but the timestamps that are output (for example by perf script) have lost 32-bits of granularity so they look like they are not changing at all. Fix by limiting the shift to 31 and adjusting the multiplier accordingly. Also update the documentation of perf_event_mmap_page so that new code based on it will be more future-proof. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: `b20112edea` ("perf/x86: Improve accuracy of perf/sched clock") Link: http://lkml.kernel.org/r/1445001845-13688-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-20 10:30:52 +02:00
Chuck Ebbert	558a65bc31	sched/x86: Fix typo in __switch_to() comments Fix obvious mistake: FS/GS should be DS/ES. Signed-off-by: Chuck Ebbert <cebbert.lkml@gmail.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20151014143119.78858eeb@r5 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-19 10:18:53 +02:00
Len Brown	fcafddec4e	x86/smpboot: Fix CPU #1 boot timeout The following commit: `a9bcaa02a5` ("x86/smpboot: Remove SIPI delays from cpu_up()") Caused some Intel Core2 processors to time-out when bringing up CPU #1, resulting in the missing of that CPU after bootup. That patch reduced the SIPI delays from udelay() 300, 200 to udelay() 0, 0 on modern processors. Several Intel(R) Core(TM)2 systems failed to bring up CPU #1 10/10 times after that change. Increasing either of the SIPI delays to udelay(1) results in success. So here we increase both to udelay(10). While this may be 20x slower than the absolute minimum, it is still 20x to 30x faster than the original code. Tested-by: Donald Parsons <dparsons@brightdsl.net> Tested-by: Shane <shrybman@teksavvy.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dparsons@brightdsl.net Cc: shrybman@teksavvy.com Link: http://lkml.kernel.org/r/6dd554ee8945984d85aafb2ad35793174d068af0.1444968087.git.len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-19 09:14:41 +02:00
Len Brown	f1ccd24931	x86/smpboot: Fix cpu_init_udelay=10000 corner case boot parameter misbehavior For legacy machines cpu_init_udelay defaults to 10,000. For modern machines it is set to 0. The user should be able to set cpu_init_udelay to any value on the cmdline, including 10,000. Before this patch, that was seen as "unchanged from default" and thus on a modern machine, the user request was ignored and the delay was set to 0. Signed-off-by: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: dparsons@brightdsl.net Cc: shrybman@teksavvy.com Link: http://lkml.kernel.org/r/de363cdbbcfcca1d22569683f7eb9873e0177251.1444968087.git.len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-19 09:14:41 +02:00
Vitaly Kuznetsov	c0ff971ef9	x86/ioapic: Disable interrupts when re-routing legacy IRQs A sporadic hang with consequent crash is observed when booting Hyper-V Gen1 guests: Call Trace: <IRQ> [<ffffffff810ab68d>] ? trace_hardirqs_off+0xd/0x10 [<ffffffff8107b616>] queue_work_on+0x46/0x90 [<ffffffff81365696>] ? add_interrupt_randomness+0x176/0x1d0 ... <EOI> [<ffffffff81471ddb>] ? _raw_spin_unlock_irqrestore+0x3b/0x60 [<ffffffff810c295e>] __irq_put_desc_unlock+0x1e/0x40 [<ffffffff810c5c35>] irq_modify_status+0xb5/0xd0 [<ffffffff8104adbb>] mp_register_handler+0x4b/0x70 [<ffffffff8104c55a>] mp_irqdomain_alloc+0x1ea/0x2a0 [<ffffffff810c7f10>] irq_domain_alloc_irqs_recursive+0x40/0xa0 [<ffffffff810c860c>] __irq_domain_alloc_irqs+0x13c/0x2b0 [<ffffffff8104b070>] alloc_isa_irq_from_domain.isra.1+0xc0/0xe0 [<ffffffff8104bfa5>] mp_map_pin_to_irq+0x165/0x2d0 [<ffffffff8104c157>] pin_2_irq+0x47/0x80 [<ffffffff81744253>] setup_IO_APIC+0xfe/0x802 ... [<ffffffff814631c0>] ? rest_init+0x140/0x140 The issue is easily reproducible with a simple instrumentation: if mdelay(10) is put between mp_setup_entry() and mp_register_handler() calls in mp_irqdomain_alloc() Hyper-V guest always fails to boot when re-routing IRQ0. The issue seems to be caused by the fact that we don't disable interrupts while doing IOPIC programming for legacy IRQs and IRQ0 actually happens. Protect the setup sequence against concurrent interrupts. [ tglx: Make the protection unconditional and not only for legacy interrupts ] Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1444930943-19336-1-git-send-email-vkuznets@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-10-16 16:31:24 +02:00
Paolo Bonzini	f5f3497cad	x86/setup: Extend low identity map to cover whole kernel range On 32-bit systems, the initial_page_table is reused by efi_call_phys_prolog as an identity map to call SetVirtualAddressMap. efi_call_phys_prolog takes care of converting the current CPU's GDT to a physical address too. For PAE kernels the identity mapping is achieved by aliasing the first PDPE for the kernel memory mapping into the first PDPE of initial_page_table. This makes the EFI stub's trick "just work". However, for non-PAE kernels there is no guarantee that the identity mapping in the initial_page_table extends as far as the GDT; in this case, accesses to the GDT will cause a page fault (which quickly becomes a triple fault). Fix this by copying the kernel mappings from swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at identity mapping. For some reason, this is only reproducible with QEMU's dynamic translation mode, and not for example with KVM. However, even under KVM one can clearly see that the page table is bogus: $ qemu-system-i386 -pflash OVMF.fd -M q35 vmlinuz0 -s -S -daemonize $ gdb (gdb) target remote localhost:1234 (gdb) hb 0x02858f6f Hardware assisted breakpoint 1 at 0x2858f6f (gdb) c Continuing. Breakpoint 1, 0x02858f6f in ?? () (gdb) monitor info registers ... GDT= 0724e000 000000ff IDT= fffbb000 000007ff CR0=0005003b CR2=ff896000 CR3=032b7000 CR4=00000690 ... The page directory is sane: (gdb) x/4wx 0x32b7000 0x32b7000: 0x03398063 0x03399063 0x0339a063 0x0339b063 (gdb) x/4wx 0x3398000 0x3398000: 0x00000163 0x00001163 0x00002163 0x00003163 (gdb) x/4wx 0x3399000 0x3399000: 0x00400003 0x00401003 0x00402003 0x00403003 but our particular page directory entry is empty: (gdb) x/1wx 0x32b7000 + (0x724e000 >> 22) 4 0x32b7070: 0x00000000 [ It appears that you can skate past this issue if you don't receive any interrupts while the bogus GDT pointer is loaded, or if you avoid reloading the segment registers in general. Andy Lutomirski provides some additional insight: "AFAICT it's entirely permissible for the GDTR and/or LDT descriptor to point to unmapped memory. Any attempt to use them (segment loads, interrupts, IRET, etc) will try to access that memory as if the access came from CPL 0 and, if the access fails, will generate a valid page fault with CR2 pointing into the GDT or LDT." Up until commit `23a0d4e8fa` ("efi: Disable interrupts around EFI calls, not in the epilog/prolog calls") interrupts were disabled around the prolog and epilog calls, and the functional GDT was re-installed before interrupts were re-enabled. Which explains why no one has hit this issue until now. ] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Laszlo Ersek <lersek@redhat.com> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com> [ Updated changelog. ]	2015-10-16 10:52:29 +01:00
Lukasz Anaczkowski	d81056b527	x86, ACPI: Handle apic/x2apic entries in MADT in correct order ACPI specifies the following rules when listing APIC IDs: (1) Boot processor is listed first (2) For multi-threaded processors, BIOS should list the first logical processor of each of the individual multi-threaded processors in MADT before listing any of the second logical processors. (3) APIC IDs < 0xFF should be listed in APIC subtable, APIC IDs >= 0xFF should be listed in X2APIC subtable Because of above, when there's more than 0xFF logical CPUs, BIOS interleaves APIC/X2APIC subtables. Assuming, there's 72 cores, 72 hyper-threads each, 288 CPUs total, listing is like this: APIC (0,4,8, .., 252) X2APIC (258,260,264, .. 284) APIC (1,5,9,...,253) X2APIC (259,261,265,...,285) APIC (2,6,10,...,254) X2APIC (260,262,266,..,286) APIC (3,7,11,...,251) X2APIC (255,261,262,266,..,287) Now, before this patch, due to how ACPI MADT subtables were parsed (BSP then X2APIC then APIC), kernel enumerated CPUs in reverted order (i.e. high APIC IDs were getting low logical IDs, and low APIC IDs were getting high logical IDs). This is wrong for the following reasons: () it's hard to predict how cores and threads are enumerated () when it's hard to predict, s/w threads cannot be properly affinitized causing significant performance impact due to e.g. inproper cache sharing () enumeration is inconsistent with how threads are enumerated on other Intel Xeon processors So, order in which MADT APIC/X2APIC handlers are passed is reverse and both handlers are passed to be called during same MADT table to walk to achieve correct CPU enumeration. In scenario when someone boots kernel with options 'maxcpus=72 nox2apic', in result less cores may be booted, since some of the CPUs the kernel will try to use will have APIC ID >= 0xFF. In such case, one should not pass 'nox2apic'. Disclimer: code parsing MADT APIC/X2APIC has not been touched since 2009, when X2APIC support was initially added. I do not know why MADT parsing code was added in the reversed order in the first place. I guess it didn't matter at that time since nobody cared about cores with APIC IDs >= 0xFF, right? This patch is based on work of "Yinghai Lu <yinghai@kernel.org>" previously published at https://lkml.org/lkml/2013/1/21/563 Here's the explanation why parsing interface needs to be changed and why simpler approach will not work https://lkml.org/lkml/2015/9/7/285 Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> (commit message) Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2015-10-15 01:31:06 +02:00
Ingo Molnar	790a2ee242	* Make the EFI System Resource Table (ESRT) driver explicitly non-modular by ripping out the module_* code since Kconfig doesn't allow it to be built as a module anyway - Paul Gortmaker * Make the x86 efi=debug kernel parameter, which enables EFI debug code and output, generic and usable by arm64 - Leif Lindholm * Add support to the x86 EFI boot stub for 64-bit Graphics Output Protocol frame buffer addresses - Matt Fleming * Detect when the UEFI v2.5 EFI_PROPERTIES_TABLE feature is enabled in the firmware and set an efi.flags bit so the kernel knows when it can apply more strict runtime mapping attributes - Ard Biesheuvel * Auto-load the efi-pstore module on EFI systems, just like we currently do for the efivars module - Ben Hutchings * Add "efi_fake_mem" kernel parameter which allows the system's EFI memory map to be updated with additional attributes for specific memory ranges. This is useful for testing the kernel code that handles the EFI_MEMORY_MORE_RELIABLE memmap bit even if your firmware doesn't include support - Taku Izumi -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWG7OwAAoJEC84WcCNIz1VEEEP/0SsdrwJ66B4MfP5YNjqHYWm +OTHR6Ovv2i10kc+NjOV/GN8sWPndnkLfIfJ4EqJ9BoQ9PDEYZilV2aleSQ4DrPm H7uGwBXQkfd76tZKX9pMToK76mkhg6M7M2LR3Suv3OGfOEzuozAOt3Ez37lpksTN 2ByhHr/oGbhu99jC2ki5+k0ySH8PMqDBRxqrPbBzTD+FfB7bM11vAJbSNbSMQ21R ZwX0acZBLqb9J2Vf7tDsW+fCfz0TFo8JHW8jdLRFm/y2dpquzxswkkBpODgA8+VM 0F5UbiUdkaIRug75I6N/OJ8+yLwdzuxm7ul+tbS3JrXGLAlK3850+dP2Pr5zQ2Ce zaYGRUy+tD5xMXqOKgzpu+Ia8XnDRLhOlHabiRd5fG6ZC9nR8E9uK52g79voSN07 pADAJnVB03CGV/HdduDOI4C4UykUKubuArbQVkqWJcecV1Jic/tYI0gjeACmU1VF v8FzXpBUe3U3A0jauOz8PBz8M+k5qky/GbIrnEvXreBtKdt999LN9fykTN7rBOpo dk/6vTR1Jyv3aYc9EXHmRluktI6KmfWCqmRBOIgQveX1VhdRM+1w2LKC0+8co3dF v/DBh19KDyfPI8eOvxKykhn164UeAt03EXqDa46wFGr2nVOm/JiShL/d+QuyYU4G 8xb/rET4JrhCG4gFMUZ7 =1Oee -----END PGP SIGNATURE----- Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into core/efi Pull v4.4 EFI updates from Matt Fleming: - Make the EFI System Resource Table (ESRT) driver explicitly non-modular by ripping out the module_* code since Kconfig doesn't allow it to be built as a module anyway. (Paul Gortmaker) - Make the x86 efi=debug kernel parameter, which enables EFI debug code and output, generic and usable by arm64. (Leif Lindholm) - Add support to the x86 EFI boot stub for 64-bit Graphics Output Protocol frame buffer addresses. (Matt Fleming) - Detect when the UEFI v2.5 EFI_PROPERTIES_TABLE feature is enabled in the firmware and set an efi.flags bit so the kernel knows when it can apply more strict runtime mapping attributes - Ard Biesheuvel - Auto-load the efi-pstore module on EFI systems, just like we currently do for the efivars module. (Ben Hutchings) - Add "efi_fake_mem" kernel parameter which allows the system's EFI memory map to be updated with additional attributes for specific memory ranges. This is useful for testing the kernel code that handles the EFI_MEMORY_MORE_RELIABLE memmap bit even if your firmware doesn't include support. (Taku Izumi) Note: there is a semantic conflict between the following two commits: `8a53554e12` ("x86/efi: Fix multiple GOP device support") `ae2ee627dc` ("efifb: Add support for 64-bit frame buffer addresses") I fixed up the interaction in the merge commit, changing the type of current_fb_base from u32 to u64. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-14 16:51:34 +02:00
Andy Shevchenko	3435dd0809	x86/early_printk: Set __iomem address space for IO There are following warnings on unpatched code: arch/x86/kernel/early_printk.c:198:32: warning: incorrect type in initializer (different address spaces) arch/x86/kernel/early_printk.c:198:32: expected void [noderef] <asn:2>vaddr arch/x86/kernel/early_printk.c:198:32: got unsigned int [usertype] <noident> arch/x86/kernel/early_printk.c:205:32: warning: incorrect type in initializer (different address spaces) arch/x86/kernel/early_printk.c:205:32: expected void [noderef] <asn:2>vaddr arch/x86/kernel/early_printk.c:205:32: got unsigned int [usertype] <noident> Annotate it proper. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: http://lkml.kernel.org/r/1444646837-42615-1-git-send-email-andriy.shevchenko@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-10-13 21:45:56 +02:00
Borislav Petkov	0399f73299	x86/microcode/amd: Do not overwrite final patch levels A certain number of patch levels of applied microcode should not be overwritten by the microcode loader, otherwise bad things will happen. Check those and abort update if the current core has one of those final patch levels applied by the BIOS. 32-bit needs special handling, of course. See https://bugzilla.suse.com/show_bug.cgi?id=913996 for more info. Tested-by: Peter Kirchgeßner <pkirchgessner@t-online.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1444641762-9437-7-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-12 16:15:48 +02:00
Borislav Petkov	2eff73c0a1	x86/microcode/amd: Extract current patch level read to a function Pave the way for checking the current patch level of the microcode in a core. We want to be able to do stuff depending on the patch level - in this case decide whether to update or not. But that will be added in a later patch. Drop unused local var uci assignment, while at it. Integrate a fix for 32-bit and CONFIG_PARAVIRT from Takashi Iwai: Use native_rdmsr() in check_current_patch_level() because with CONFIG_PARAVIRT enabled and on 32-bit, where we run before paging has been enabled, we cannot deref pv_info yet. Or we could, but we'd need to access its physical address. This way of fixing it is simpler. See: https://bugzilla.suse.com/show_bug.cgi?id=943179 for the background. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Takashi Iwai <tiwai@suse.com>: Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/1444641762-9437-6-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-12 16:15:48 +02:00
Taku Izumi	0f96a99dab	efi: Add "efi_fake_mem" boot option This patch introduces new boot option named "efi_fake_mem". By specifying this parameter, you can add arbitrary attribute to specific memory range. This is useful for debugging of Address Range Mirroring feature. For example, if "efi_fake_mem=2G@4G:0x10000,2G@0x10a0000000:0x10000" is specified, the original (firmware provided) EFI memmap will be updated so that the specified memory regions have EFI_MEMORY_MORE_RELIABLE attribute (0x10000): <original> efi: mem36: [Conventional Memory\| \| \| \| \| \| \|WB\|WT\|WC\|UC] range=[0x0000000100000000-0x00000020a0000000) (129536MB) <updated> efi: mem36: [Conventional Memory\| \|MR\| \| \| \| \|WB\|WT\|WC\|UC] range=[0x0000000100000000-0x0000000180000000) (2048MB) efi: mem37: [Conventional Memory\| \| \| \| \| \| \|WB\|WT\|WC\|UC] range=[0x0000000180000000-0x00000010a0000000) (61952MB) efi: mem38: [Conventional Memory\| \|MR\| \| \| \| \|WB\|WT\|WC\|UC] range=[0x00000010a0000000-0x0000001120000000) (2048MB) efi: mem39: [Conventional Memory\| \| \| \| \| \| \|WB\|WT\|WC\|UC] range=[0x0000001120000000-0x00000020a0000000) (63488MB) And you will find that the following message is output: efi: Memory: 4096M/131455M mirrored memory Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2015-10-12 14:20:09 +01:00
Ingo Molnar	cdbcd239e2	Merge branch 'x86/ras' into ras/core, to pick up changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-12 14:52:34 +02:00
Minfei Huang	e9c40d257f	x86/kexec: Remove obsolete 'in_crash_kexec' flag Previously, UV NMI used the 'in_crash_kexec' flag to determine whether we are in a kdump kernel or not: `5edd19af18` ("x86, UV: Make kdump avoid stack dumps") But this flags was removed in the following commit: `9c48f1c629` ("x86, nmi: Wire up NMI handlers to new routines") Since it isn't used any more, remove it. Signed-off-by: Minfei Huang <mnfhuang@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: cpw@sgi.com Cc: kexec@lists.infradead.org Cc: mhuang@redhat.com Link: http://lkml.kernel.org/r/1444070155-17934-1-git-send-email-mhuang@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-10-12 09:43:11 +02:00

1 2 3 4 5 ...

11654 Commits