The commit f4780ca005 moves
swiotlb initialization before dma32_free_bootmem(). It's
supposed to fix a bug that the commit
75f1cdf1dd introduced, we
initialize SWIOTLB right after dma32_free_bootmem so we wrongly
steal memory area allocated for GART with broken BIOS earlier.
However, the above commit introduced another problem, which
likely breaks machines with huge amount of memory. Such a box
use the majority of DMA32_ZONE so there is no memory for
swiotlb.
With this patch, the x86 IOMMU initialization sequence are:
1. We set swiotlb to 1 in the case of (max_pfn > MAX_DMA32_PFN
&& !no_iommu). If swiotlb usage is forced by the boot option,
we go to the step 3 and finish (we don't try to detect IOMMUs).
2. We call the detection functions of all the IOMMUs. The
detection function sets x86_init.iommu.iommu_init to the IOMMU
initialization function (so we can avoid calling the
initialization functions of all the IOMMUs needlessly).
3. We initialize swiotlb (and set dma_ops to swiotlb_dma_ops) if
swiotlb is set to 1.
4. If the IOMMU initialization function doesn't need swiotlb
(e.g. the initialization is sucessful) then sets swiotlb to zero.
5. If we find that swiotlb is set to zero, we free swiotlb
resource.
Reported-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <20091215204729A.fujita.tomonori@lab.ntt.co.jp>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This adds a new category of symbols to the relocs program: symbols
which are known to be relative, even though the linker emits them as
absolute; this is the case for symbols that live in the linker script,
which currently applies to _end.
Unfortunately the previous workaround of putting _end in its own empty
section was defeated by newer binutils, which remove empty sections
completely.
This patch also changes the symbol matching to use regular expressions
instead of hardcoded C for specific patterns.
This is a decidedly non-minimal patch: a modified version of the
relocs program is used as part of the Syslinux build, and this is
basically a backport to Linux of some of those changes; they have
thus been well tested.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4AF86211.3070103@zytor.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
The MSR driver would compute the values for cpu and c at declaration,
and then again in the body of the function. This isn't merely
redundant, but unsafe, since cpu might not refer to a valid CPU at
that point.
Remove the unnecessary and dangerous references in the declarations.
This code now matches the equivalent code in the CPUID driver.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It looks better to have a common function. No change in functionality.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <4B25FDDC.407@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Add check if APIC is not disabled since thermal
monitoring depends on it. As only apic gets disabled
we should not try to install "thermal monitor" vector,
print out that thermal monitoring is enabled and etc...
Note that "Intel Correct Machine Check Interrupts" already
has such a check.
Also I decided to not add cpu_has_apic check into
mcheck_intel_therm_init since even if it'll call apic_read on
disabled apic -- it's safe here and allow us to save a few code
bytes.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: <4B25FDC2.3020401@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This fixes the following breakage of the commit
75f1cdf1dd:
- GART systems that don't AGP with broken BIOS and more than 4GB
memory are forced to use swiotlb. They can allocate aperture by
hand and use GART.
- GART systems without GAP must disable GART on shutdown.
- swiotlb usage is forced by the boot option,
gart_iommu_hole_init() is not called, so we disable GART
early_gart_iommu_check().
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <1260759135-6450-3-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The commit 75f1cdf1dd introduced a
bug that we initialize SWIOTLB right after dma32_free_bootmem so
we wrongly steal memory area allocated for GART with broken BIOS
earlier.
This moves swiotlb initialization before dma32_free_bootmem().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: yinghai@kernel.org
LKML-Reference: <1260759135-6450-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When there are a large number of processors in a system, there
is an excessive amount of messages sent to the system console.
It's estimated that with 4096 processors in a system, and the
console baudrate set to 56K, the startup messages will take
about 84 minutes to clear the serial port.
This set of patches limits the number of repetitious messages
which contain no additional information. Much of this information
is obtainable from the /proc and /sysfs. Some of the messages
are also sent to the kernel log buffer as KERN_DEBUG messages so
dmesg can be used to examine more closely any details specific to
a problem.
The new cpu bootup sequence for system_state == SYSTEM_BOOTING:
Booting Node 0, Processors #1#2#3#4#5#6#7 Ok.
Booting Node 1, Processors #8#9#10#11#12#13#14#15 Ok.
...
Booting Node 3, Processors #56#57#58#59#60#61#62#63 Ok.
Brought up 64 CPUs
After the system is running, a single line boot message is displayed
when CPU's are hotplugged on:
Booting Node %d Processor %d APIC 0x%x
Status of the following lines:
CPU: Physical Processor ID: printed once (for boot cpu)
CPU: Processor Core ID: printed once (for boot cpu)
CPU: Hyper-Threading is disabled printed once (for boot cpu)
CPU: Thermal monitoring enabled printed once (for boot cpu)
CPU %d/0x%x -> Node %d: removed
CPU %d is now offline: only if system_state == RUNNING
Initializing CPU#%d: KERN_DEBUG
Signed-off-by: Mike Travis <travis@sgi.com>
LKML-Reference: <4B219E28.8080601@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Print only once that the system is supporting x2apic mode.
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <4B226E92.5080904@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Jens found the following crash/regression:
[ 0.000000] found SMP MP-table at [ffff8800000fdd80] fdd80
[ 0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 0-fff BIOS data page
and
[ 0.000000] Kernel panic - not syncing: Overlapping early reservations 12-f011 MP-table mpc to 6000-7fff TRAMPOLINE
and bisected it to b24c2a9 ("x86: Move find_smp_config()
earlier and avoid bootmem usage").
It turns out the BIOS is using the first 64k for mptable,
without reserving it.
So try to find good range for the real-mode trampoline instead of
hard coding it, in case some bios tries to use that range for sth.
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
LKML-Reference: <4B21630A.6000308@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The per_cpu cpuid4_info shared_map can contain stale data when CPUs are added
and removed.
The stale data can lead to a NULL pointer derefernce panic on a remove of a
CPU that has had siblings previously removed.
This patch resolves the panic by verifying a cpu is actually online before
adding it to the shared_cpu_map, only examining cpus that are part of
the same lower level cache, and by updating other siblings lowest level cache
maps when a cpu is added.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20091209183336.17855.98708.sendpatchset@prarit.bos.redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-6-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-5-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The arg should be in %eax, but that is clobbered by the return value
of clone. The function pointer can be in any register. Also, don't
push args onto the stack, since regparm(3) is the normal calling
convention now.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-4-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Use user_mode() instead of a magic value for sp to determine when returning
to kernel mode.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-3-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Prepare for merging with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260380084-3707-2-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The device change notifier is initialized in the dma_ops
initialization path. But this path is never executed for
iommu=pt. Move the notifier initialization to IOMMU hardware
init code to fix this.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The data structure changes to use dev->archdata.iommu field
broke the iommu=pt mode because in this case the
dev->archdata.iommu was left uninitialized. This moves the
inititalization of the devices into the main init function
and fixes the problem.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPICA: Update version to 20091112.
ACPICA: Add additional module-level code support
ACPICA: Deploy new create integer interface where appropriate
ACPICA: New internal utility function to create Integer objects
ACPICA: Add repair for predefined methods that must return sorted lists
ACPICA: Fix possible fault if return Package objects contain NULL elements
ACPICA: Add post-order callback to acpi_walk_namespace
ACPICA: Change package length error message to an info message
ACPICA: Reduce severity of predefined repair messages, Warning to Info
ACPICA: Update version to 20091013
ACPICA: Fix possible memory leak for Scope ASL operator
ACPICA: Remove possibility of executing _REG methods twice
ACPICA: Add repair for bad _MAT buffers
ACPICA: Add repair for bad _BIF/_BIX packages
set_iopl_mask() is a no-op on 64 bits, but it is also a paravirt hook,
so call it even on 64 bits.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-3-git-send-email-brgerst@gmail.com>
In the PTREGSCALL1 and 2 macros, we can trivially avoid an unnecessary
pipeline serialization, so do so.
In PTREGSCALLS3 this is much less clear-cut since we have to push a
new value to the stack. Leave it alone for now assuming it is as good
as it is going to be; may want to check on Atom or another in-order
x86 to see if we can do better.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-2-git-send-email-brgerst@gmail.com>
Change 32-bit sys_clone to new PTREGSCALL stub, and merge with 64-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-7-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Convert these to new PTREGSCALL stubs.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-6-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Change 32-bit sys_sigaltstack to PTREGSCALL2, and merge with 64-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-5-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Change 32-bit sys_execve to PTREGSCALL3, and merge with 64-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-4-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Change 32-bit sys_iopl to PTREGSCALL1, and merge with 64-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-3-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add new stubs which add the pt_regs pointer as the last arg, matching
64-bit. This will allow these syscalls to be easily merged.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
LKML-Reference: <1260403316-5679-2-git-send-email-brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Robert Hancock observes that DMI_BOARD_NAME is often more useful
than DMI_PRODUCT_NAME, especially on standalone motherboards.
So, print both.
Signed-off-by: Andy Isaacson <adi@hexapodia.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Richard Zidlicky <rz@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20091208083021.GB27174@hexapodia.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Unify x86_32 and x86_64 implementations of __show_regs() header,
standardizing on the x86_64 format string in the process. Also,
32-bit will now call print_modules.
Signed-off-by: Andy Isaacson <adi@hexapodia.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: Richard Zidlicky <rz@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20091208082942.GA27174@hexapodia.org>
[ v2: resolved conflict ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- Use #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
- Remove "microcode: " prefix from each pr_<level>
- Fix duplicated KERN_ERR prefix
- Coalesce pr_<level> format strings
- Add a space after an exclamation point
No other change in output.
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Andreas Herrmann <herrmann.der.user@googlemail.com>
LKML-Reference: <1260340250.27677.191.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
timers, init: Limit the number of per cpu calibration bootup messages
posix-cpu-timers: optimize and document timer_create callback
clockevents: Add missing include to pacify sparse
x86: vmiclock: Fix printk format
x86: Fix printk format due to variable type change
sparc: fix printk for change of variable type
clocksource/events: Fix fallout of generic code changes
nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
nohz: Track last do_timer() cpu
nohz: Prevent clocksource wrapping during idle
nohz: Type cast printk argument
mips: Use generic mult/shift factor calculation for clocks
clocksource: Provide a generic mult/shift factor calculation
clockevents: Use u32 for mult and shift factors
nohz: Introduce arch_needs_cpu
nohz: Reuse ktime in sub-functions of tick_check_idle.
time: Remove xtime_cache
time: Implement logarithmic time accumulation
* 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: hpet: Make WARN_ON understandable
x86: arch specific support for remapping HPET MSIs
intr-remap: generic support for remapping HPET MSIs
x86, hpet: Simplify the HPET code
x86, hpet: Disable per-cpu hpet timer if ARAT is supported
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mce: don't restart timer if disabled
x86: Use -maccumulate-outgoing-args for sane mcount prologues
x86: Prevent GCC 4.4.x (pentium-mmx et al) function prologue wreckage
x86: AMD Northbridge: Verify NB's node is online
x86 VSDO: Fix Kconfig help
x86: Fix typo in Intel CPU cache size descriptor
x86: Add new Intel CPU cache size descriptors
* 'x86-reboot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/reboot: Add pci_dev_put in reboot_fixup_32.c for consistency
* 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-64: merge the standard and compat start_thread() functions
x86-64: make compat_start_thread() match start_thread()
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits)
x86, mm: Correct the implementation of is_untracked_pat_range()
x86/pat: Trivial: don't create debugfs for memtype if pat is disabled
x86, mtrr: Fix sorting of mtrr after subtracting
x86: Move find_smp_config() earlier and avoid bootmem usage
x86, platform: Change is_untracked_pat_range() to bool; cleanup init
x86: Change is_ISA_range() into an inline function
x86, mm: is_untracked_pat_range() takes a normal semiclosed range
x86, mm: Call is_untracked_pat_range() rather than is_ISA_range()
x86: UV SGI: Don't track GRU space in PAT
x86: SGI UV: Fix BAU initialization
x86, numa: Use near(er) online node instead of roundrobin for NUMA
x86, numa, bootmem: Only free bootmem on NUMA failure path
x86: Change crash kernel to reserve via reserve_early()
x86: Eliminate redundant/contradicting cache line size config options
x86: When cleaning MTRRs, do not fold WP into UC
x86: remove "extern" from function prototypes in <asm/proto.h>
x86, mm: Report state of NX protections during boot
x86, mm: Clean up and simplify NX enablement
x86, pageattr: Make set_memory_(x|nx) aware of NX support
x86, sleep: Always save the value of EFER
...
Fix up conflicts (added both iommu_shutdown and is_untracked_pat_range)
to 'struct x86_platform_ops') in
arch/x86/include/asm/x86_init.h
arch/x86/kernel/x86_init.c
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: ucode-amd: Move family check to microcde_amd.c's init function
x86, ucode-amd: Ensure ucode update on suspend/resume after CPU off/online cycle
x86: ucode-amd: Convert printk(KERN_*...) to pr_*(...)
x86: ucode-amd: Don't warn when no ucode is available for a CPU revision
x86: ucode-amd: Load ucode-patches once and not separately of each CPU
x86, amd-ucode: Remove needless log messages
Commit cebe182033 had an unnecessary,
wrong change: &mce_banks[i].attr is equivalent to the former
bank_attrs[i], not to mce_attrs[i].
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <4B1E05CC.4040703@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (84 commits)
KVM: VMX: Fix comparison of guest efer with stale host value
KVM: s390: Fix prefix register checking in arch/s390/kvm/sigp.c
KVM: Drop user return notifier when disabling virtualization on a cpu
KVM: VMX: Disable unrestricted guest when EPT disabled
KVM: x86 emulator: limit instructions to 15 bytes
KVM: s390: Make psw available on all exits, not just a subset
KVM: x86: Add KVM_GET/SET_VCPU_EVENTS
KVM: VMX: Report unexpected simultaneous exceptions as internal errors
KVM: Allow internal errors reported to userspace to carry extra data
KVM: Reorder IOCTLs in main kvm.h
KVM: x86: Polish exception injection via KVM_SET_GUEST_DEBUG
KVM: only clear irq_source_id if irqchip is present
KVM: x86: disallow KVM_{SET,GET}_LAPIC without allocated in-kernel lapic
KVM: x86: disallow multiple KVM_CREATE_IRQCHIP
KVM: VMX: Remove vmx->msr_offset_efer
KVM: MMU: update invlpg handler comment
KVM: VMX: move CR3/PDPTR update to vmx_set_cr3
KVM: remove duplicated task_switch check
KVM: powerpc: Fix BUILD_BUG_ON condition
KVM: VMX: Use shared msr infrastructure
...
Trivial conflicts due to new Kconfig options in arch/Kconfig and kernel/Makefile
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
mac80211: fix reorder buffer release
iwmc3200wifi: Enable wimax core through module parameter
iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
iwmc3200wifi: Coex table command does not expect a response
iwmc3200wifi: Update wiwi priority table
iwlwifi: driver version track kernel version
iwlwifi: indicate uCode type when fail dump error/event log
iwl3945: remove duplicated event logging code
b43: fix two warnings
ipw2100: fix rebooting hang with driver loaded
cfg80211: indent regulatory messages with spaces
iwmc3200wifi: fix NULL pointer dereference in pmkid update
mac80211: Fix TX status reporting for injected data frames
ath9k: enable 2GHz band only if the device supports it
airo: Fix integer overflow warning
rt2x00: Fix padding bug on L2PAD devices.
WE: Fix set events not propagated
b43legacy: avoid PPC fault during resume
b43: avoid PPC fault during resume
tcp: fix a timewait refcnt race
...
Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
kernel/sysctl_check.c
net/ipv4/sysctl_net_ipv4.c
net/ipv6/addrconf.c
net/sctp/sysctl.c