so could figure out bugs where we get an interrupt, but vector_irq is
not initialized yet.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
so we can merge io_apic_32.c and io_apic_64.c
v2: Use cpu_online_map as target cpus for bigsmp, just like 64-bit is doing.
Also remove some unused TARGET_CPUS macro.
v3: need to check if desc is null in smp_irq_move_cleanup
also migration needs to reset vector too, so copy __target_IO_APIC_irq
from 64bit.
(the duplication will go away once the two files are unified.)
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
but actually irq still needs to be less than NR_IRQS, because
interrupt[NR_IRQS] in entry.S.
need to enable per_cpu vector...
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This has been deprecated for years, the user space irqbalanced utility
works better with numa, has configurable policies, etc...
Signed-off-by: Yinghai Lu <yhlu.kernel@gmai.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
also print out irq no in /proc/interrups and /proc/stat in hex, so could
tell bus/dev/func.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
change names:
irq_desc() ==> irq_desc_alloc
__irq_desc() ==> irq_desc
Also split a few of the uses in lowlevel x86 code.
v2: need to check if desc is null in smp_irq_move_cleanup
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
So we could remove some duplicated calling to irq_desc
v2: make sure irq_desc in init/main.c is not used without generic_hardirqs
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove irq limit checks - nr_irqs is dynamic and we expand anytime.
v2: fix checking about result irq_cfg_without_new, so could use msi again
v3: use irq_desc_without_new to check irq is valid
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are a handful of loops that go from 0 to nr_irqs and use
get_irq_desc() on them. These would allocate all the irq_desc
entries, regardless of the need for them.
Use the smarter for_each_irq_desc() iterator that will only iterate
over the present ones.
v2: make sure arch without GENERIC_HARDIRQS work too
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
based on Eric's patch ...
together mold it with dyn_array for irq_desc, will allcate kstat_irqs for
nr_irq_desc alltogether if needed. -- at that point nr_cpus is known already.
v2: make sure system without generic_hardirqs works they don't have irq_desc
v3: fix merging
v4: [mingo@elte.hu] fix typo
[ mingo@elte.hu ] irq: build fix
fix:
arch/x86/xen/spinlock.c: In function 'xen_spin_lock_slow':
arch/x86/xen/spinlock.c:90: error: 'struct kernel_stat' has no member named 'irqs'
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
preallocate 32 irq_2_pin entries, and use get_one_free_irq_2_pin() to get
one more and link to irq_cfg if needed.
so don't waste one where no irq is enabled.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
preallocate size is 32, and if it is not enough, irq_cfg will more
via alloc_bootmem() or kzalloc(). (depending on how early we are in
system setup)
v2: fix typo about size of init_one_irq_cfg ... should use sizeof(struct irq_cfg)
v3: according to Eric, change get_irq_cfg() to irq_cfg()
v4: squash add irq_cfg_alloc in
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add CONFIG_HAVE_SPARSE_IRQ to for use condensed array.
Get rid of irq_desc[] array assumptions.
Preallocate 32 irq_desc, and irq_desc() will try to get more.
( No change in functionality is expected anywhere, except the odd build
failure where we missed a code site or where a crossing commit itroduces
new irq_desc[] usage. )
v2: according to Eric, change get_irq_desc() to irq_desc()
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Until now, NR_IRQS was derived from black magic defines that had to
be "large enough" to both accomodate NR_CPUS and MAX_NR_IO_APICs.
This resulted in a way too large irq_desc[] array on most x86 systems.
Especially with larger CPU masks, the size of irq_desc can spiral out
of control quickly.
So be smarter about it and use precise allocation instead: determine the
default maximum possible IRQ number from the ACPI MADT. Use a minimum limit
of at least 32 IRQs for broken BIOSes.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix non-APIC UP build:
arch/x86/kernel/built-in.o: In function `setup_arch':
: undefined reference to `pin_map_size'
arch/x86/kernel/built-in.o: In function `setup_arch':
: undefined reference to `first_free_entry'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
also add first_free_entry and pin_map_size, which were NR_IRQS derived
constants.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
so could spare some memory with small alignment in bootmem
also tighten the alignment checking, and make print out less debug info.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
arch/x86/kernel/acpi/boot.c💯 warning: 'acpi_mcfg_64bit_base_addr' defined
but not used
http://bugzilla.kernel.org/show_bug.cgi?id=11743
Signed-off-by: Pavel Vasilyev <linuxoid@tochka.ru>
Signed-off-by: Len Brown <len.brown@intel.com>
arch/x86/kernel/pvclock.c:102:6: warning: symbol 'tsc_khz' shadows an earlier one
include/asm/tsc.h:18:21: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
We're currently facing timing problems in guests that do
calibration under heavy load, and then the load vanishes.
This means we'll have a much lower lpj than we actually should,
and delays end up taking less time than they should, which is a
nasty bug.
Solution is to pass on the lpj value from host to guest, and have it
preset.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM intends to use paravirt code to calibrate khz. Xen
current code will do just fine. So as a first step, factor out
code to pvclock.c.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Most if not all x86 platforms have an RTC device, but sometimes the RTC
is not exposed as a PNP0b00/PNP0b01/PNP0b02 device in PNPBIOS or ACPI:
http://bugzilla.kernel.org/show_bug.cgi?id=11580https://bugzilla.redhat.com/show_bug.cgi?id=451188
It's best if we can discover the RTC via PNP because then we know
which flavor of device it is, where it lives, and which IRQ it uses.
But if we can't, we should register a platform device using the
compiled-in RTC_PORT/RTC_IRQ resource assumptions.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Reported-by: Rik Theys <rik.theys@esat.kuleuven.be>
Reported-by: shr_msn@yahoo.com.tw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 4c3dc21b136f8cb4b72afee16c3ba7e961656c0b in tip introduced the
5-byte NOP ftrace_test_p6nop:
jmp . + 5
.byte 0x00, 0x00, 0x00
This is not friendly to disassemblers because an odd number of 0x00s
ends in the middle of an instruction boundary. This changes the 0x00s
to 1-byte NOPs (0x90).
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
With latest -tip I get this bug:
[ 49.439988] in_atomic():0, irqs_disabled():1
[ 49.440118] INFO: lockdep is turned off.
[ 49.440118] Pid: 2814, comm: modprobe Tainted: G W 2.6.27-rc7 #4
[ 49.440118] [<c01215e1>] __might_sleep+0xe1/0x120
[ 49.440118] [<c01148ea>] ftrace_modify_code+0x2a/0xd0
[ 49.440118] [<c01148a2>] ? ftrace_test_p6nop+0x0/0xa
[ 49.440118] [<c016e80e>] __ftrace_update_code+0xfe/0x2f0
[ 49.440118] [<c01148a2>] ? ftrace_test_p6nop+0x0/0xa
[ 49.440118] [<c016f190>] ftrace_convert_nops+0x50/0x80
[ 49.440118] [<c016f1d6>] ftrace_init_module+0x16/0x20
[ 49.440118] [<c015498b>] load_module+0x185b/0x1d30
[ 49.440118] [<c01767a0>] ? find_get_page+0x0/0xf0
[ 49.440118] [<c02463c0>] ? sprintf+0x0/0x30
[ 49.440118] [<c034e012>] ? mutex_lock_interruptible_nested+0x1f2/0x350
[ 49.440118] [<c0154eb3>] sys_init_module+0x53/0x1b0
[ 49.440118] [<c0352340>] ? do_page_fault+0x0/0x740
[ 49.440118] [<c0104012>] syscall_call+0x7/0xb
[ 49.440118] =======================
It is because ftrace_modify_code() calls copy_to_user and
copy_from_user.
These functions have been inserted after guessing that there
couldn't be any race condition but copy_[to/from]_user might
sleep and __ftrace_update_code is called with local_irq_saved.
These function have been inserted since this commit:
d5e92e8978fd2574e415dc2792c5eb592978243d:
"ftrace: x86 use copy from user function"
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Could just as easily change the three casts to cast to the correct
type...this patch changes the type of ftrace_nop instead.
Supresses sparse warnings:
arch/x86/kernel/ftrace.c:157:14: warning: incorrect type in assignment (different signedness)
arch/x86/kernel/ftrace.c:157:14: expected long *static [toplevel] ftrace_nop
arch/x86/kernel/ftrace.c:157:14: got unsigned long *<noident>
arch/x86/kernel/ftrace.c:161:14: warning: incorrect type in assignment (different signedness)
arch/x86/kernel/ftrace.c:161:14: expected long *static [toplevel] ftrace_nop
arch/x86/kernel/ftrace.c:161:14: got unsigned long *<noident>
arch/x86/kernel/ftrace.c:165:14: warning: incorrect type in assignment (different signedness)
arch/x86/kernel/ftrace.c:165:14: expected long *static [toplevel] ftrace_nop
arch/x86/kernel/ftrace.c:165:14: got unsigned long *<noident>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The modification of code is performed either by kstop_machine, before
SMP starts, or on module code before the module is executed. There is
no reason to do the modifications from assembly. The copy to and from
user functions are sufficient and produces cleaner and easier to read
code.
Thanks to Benjamin Herrenschmidt for suggesting the idea.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mathieu Desnoyers revealed a bug in the original code. The nop that is
used to relpace the mcount caller can be a two part nop. This runs the
risk where a process can be preempted after executing the first nop, but
before the second part of the nop.
The ftrace code calls kstop_machine to keep multiple CPUs from executing
code that is being modified, but it does not protect against a task preempting
in the middle of a two part nop.
If the above preemption happens and the tracer is enabled, after the
kstop_machine runs, all those nops will be calls to the trace function.
If the preempted process that was preempted between the two nops is executed
again, it will execute half of the call to the trace function, and this
might crash the system.
This patch instead uses what both the latest Intel and AMD spec suggests.
That is the P6_NOP5 sequence of "0x0f 0x1f 0x44 0x00 0x00".
Note, some older CPUs and QEMU might fault on this nop, so this nop
is executed with fault handling first. If it detects a fault, it will then
use the code "0x66 0x66 0x66 0x66 0x90". If that faults, it will then
default to a simple "jmp 1f; .byte 0x00 0x00 0x00; 1:". The jmp is
not optimal but will do if the first two can not be executed.
TODO: Examine the cpuid to determine the nop to use.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 now sets up the mcount locations through the build and no longer
needs to record the ip when the function is executed. This patch changes
the initial mcount to simply return. There's no need to do any other work.
If the ftrace start up test fails, the original mcount will be what everything
will use, so having this as fast as possible is a good thing.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After "dumpstack: x86: various small unification steps", the
assembler gives the following compile error. The error is in
dumpstack_64.c.
{standard input}: Assembler messages:
{standard input}:720: Error: Incorrect register `%rbx' used with `l' suffix
{standard input}:1340: Error: Incorrect register `%r12' used with `l' suffix
Indeed the suffix in get_bp() was wrong.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove remainder of additional_cpus logic. We now just listen to the
disabled_cpus value like we did for years. disabled_cpus is always >=
0 so no need for an extra check.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
additional_cpus=<x> parameter is dangerous and broken: for example
if we boot additional_cpus=-2 on a stock dual-core system it will
crash the box on bootup.
So reduce the maze of code a bit by removingthe user-configurability
angle.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
num_possible_cpus() can be > 1 when disabled CPUs have been accounted.
Disabled CPUs are not in the cpu_present_map, so we can use
num_present_cpus() as a safe indicator to switch to UP alternatives.
Reported-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- define STACKSLOTS_PER_LINE and use it
- define get_bp macro to hide the %%ebp/%%rbp difference
- i386: check task==NULL in dump_trace, like x86_64
- i386: show_trace(NULL, ...) uses current automatically
- x86_64: use [#%d] for die_counter, like i386
- whitespace and comments
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- make kstack= and early_param
- add oops=panic, setting panic_on_oops
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- x86: Write log_lvl strings if available
- start raw stack dumps on new line
- i386: Remove extra indentation for raw stack dumps
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- i386 and x86_64: always printk the 'data' parameter
- i386: announce stack switch (irq -> normal)
- i386: check if there is a stack switch before announcing it
There is a warning that 'context' might come out corrupt in early
boot. If this is true it should be fixed, not worked around.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- Add "end" parameter to valid_stack_ptr and print_context_stack
- use sizeof(long) as the size of a word on the stack
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- x86_64: use %p to print an address
- make i386-version the same as the above
The result should be the same on x86_64; on i386 the
output only changes if CONFIG_KALLSYMS is turned off,
in which case the address is printed twice.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For some reason die_nmi is still defined in traps.c for
i386, but is found in dumpstack_64.c for x86_64. Move it
to dumpstack_32.c
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
traps_32.c and traps_64.c are now equal. Move one to traps.c,
delete the other one and change the Makefile
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use CONFIG_X86_64/CONFIG_X86_32 to condtionally compile the
parts needed for x86_64 or i386 only.
Runs a small userspace for a number of minimal configurations
and boots the defconfigs.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use task_pid_nr(tsk) instead of tsk->pid in do_general_protection.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is the last user of clear_mem_error, which is defined
only on i386. Expand the inline function and remove it from
include/asm-x86/mach-default/mach_traps.h
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use preempt_conditional_sti/cli in do_int3, like on x86_64.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- rename variable me -> tsk
- get thread and tsk like i386
- expand used_math()
- copy comment
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- set_system_gate on i386 is really set_system_trap_gate
- set_system_gate on x86_64 is really set_system_intr_gate
- ist=0 means no special stack switch is done:
- introduce STACKFAULT_STACK, DOUBLEFAULT_STACK, NMI_STACK,
DEBUG_STACK and MCE_STACK as on x86_64.
- use the _ist variants with XXX_STACK set to zero
- remove set_system_gate
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
traps: x86: correct copy/paste bug: a trap is a GATE_TRAP
Fix copy/paste/forgot-to-edit bug in desc.h.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make the x86_64-version and the i386-version of do_debug
more similar.
- introduce preempt_conditional_sti/cli to i386. The preempt-count
is now elevated during the trap handler, like on x86_64. It
does not run on a separate stack, however.
- replace an open-coded "send_sigtrap"
- copy some comments
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mark the exception handlers with "dotraplinkage" to hide the
calling convention differences between i386 and x86_64.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 does not do the lazy io-bitmap dance. Putting it in
its own function makes i386's do_general_protection look
much more like x86_64's.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Split out math_error from do_coprocessor_error and simd_math_error
from do_simd_coprocessor_error, like on i386. While at it, add the
"error_code" parameter to do_coprocessor_error, do_simd_coprocessor_error
and do_spurious_interrupt_bug.
This does not change the generated code, but brings the declarations in
line with all the other trap handlers.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The dumpstack code is logically quite independent from the
hardware traps. Split it out into its own file.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The dumpstack code is logically quite independent from the
hardware traps. Split it out into its own file.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
virt_addr_valid() calls __pa(), which calls __phys_addr(). With
CONFIG_DEBUG_VIRTUAL=y, __phys_addr() will kill the kernel if the
address *isn't* valid. That's clearly wrong for virt_addr_valid().
We also incorporate the debugging checks into virt_addr_valid().
Signed-off-by: Vegard Nossum <vegardno@ben.ifi.uio.no>
Acked-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86: allow number of additional hotplug CPUs to be set at compile time, V2
The default number of additional CPU IDs for hotplugging is determined
by asking ACPI or mptables how many "disabled" CPUs there are in the
system, but many systems get this wrong so that e.g. a uniprocessor
machine gets an extra CPU allocated and never switches to single CPU
mode.
And sometimes CPU hotplugging is enabled only for suspend/hibernate
anyway, so the additional CPU IDs are not wanted. Allow the number
to be set to zero at compile time.
Also, force the number of extra CPUs to zero if hotplugging is disabled
which allows removing some conditional code.
Tested on uniprocessor x86_64 that ACPI claims has a disabled processor,
with CPU hotplugging configured.
("After" has the number of additional CPUs set to 0)
Before: NR_CPUS: 512, nr_cpu_ids: 2, nr_node_ids 1
After: NR_CPUS: 512, nr_cpu_ids: 1, nr_node_ids 1
[Changed the name of the option and the prompt according to Ingo's
suggestion.]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The flag_is_changeable_p() is used by
has_cpuid_p() which can return different results
in the code sequence below:
if (!have_cpuid_p())
identify_cpu_without_cpuid(c);
/* cyrix could have cpuid enabled via c_identify()*/
if (!have_cpuid_p())
return;
Otherwise, the gcc 3.4.6 optimizes these two calls
into one which make the code not working correctly.
Cyrix cpus have the CPUID instruction enabled before
the second call to the have_cpuid_p() but
it is not detected due to the gcc optimization.
Thus the ARR registers (mtrr like) are not detected
on such a cpu.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently the low-level function to dump user-passed registers on i386 is
called __show_registers() whereas on x86-64 it's called __show_regs(). Unify
the API to simplify porting of kmemcheck to x86-64.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add functions that use the infrastructure added by the x2apic code. These
functions were originally stubbed out since the UV code went into the
tree prior to the x2apic code.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit 4a701737 ("x86: move prefill_possible_map calling early, fix")
is the wrong fix: prefill_possible_map() needs to be available
even when CONFIG_HOTPLUG_CPU is not set. A followon patch will do that.
Fix this correctly by making prefill_possible_map() available even when
CONFIG_HOTPLUG_CPU is not set. The function is needed so that
the number of possible CPUs can be determined.
Tested on uniprocessor machine with CPU hotplug disabled.
From boot log:
Before: NR_CPUS: 512, nr_cpu_ids: 512, nr_node_ids 1
After: NR_CPUS: 512, nr_cpu_ids: 1, nr_node_ids 1
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix the size of the mappings of UV hub registers. Size must
be a function of the maximum node number within the SSI.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch hardcodes which traps should be forwarded to
handle_vm86_trap in do_trap. This allows to remove the
vm86 parameter from the i386-version of do_trap, which
makes the DO_VM86_ERROR and DO_VM86_ERROR_INFO macros
unnecessary.
x86_64 part is whitespace only.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All exceptions are taken via interrupt gates. TRACE_IRQS_OFF
is called just before entering the C code, so the irq state
is known to the irq tracer at this point. No need to call
trace_hardirqs_fixup.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All exceptions are taken via interrupt gates. TRACE_IRQS_OFF
is called just before entering the C code, so the irq state
is known to the irq tracer at this point. No need to call
trace_hardirqs_fixup.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All exceptions are taken via interrupt gates. TRACE_IRQS_OFF
is called just before entering the C code, so the irq state
is known to the irq tracer at this point. No need to call
trace_hardirqs_fixup.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add TRACE_IRQS_OFF just before entering the C code.
All exceptions are taken via interrupt gates. If irq tracing is
enabled, it should be notified as soon as possible. Interrupts
are only (conditionally) re-enabled in C code.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add TRACE_IRQS_OFF just before entering the C code.
All exceptions are taken via interrupt gates. If irq tracing is
enabled, it should be notified as soon as possible. Interrupts
are only (conditionally) re-enabled in C code.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Portions of the ACPI code needs to know if a system is a UV system prior
to genapic initialization. This patch adds a call early_acpi_boot_init()
so that the apic type is discovered earlier.
V2 of the patch adding fixes from Yinghai Lu.
Much cleaner and smaller.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Take it out of time initialization and move it to
cpu detection time.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Only test for MCA_bus if support for MCA is compiled in.
Also, for x86_64, write the code inside the conditional
for consistency with i386. It won't bite us, since it'll
probably never select CONFIG_MCA anyway.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Replace "4" in time_32.c code by sizeof(long).
This way, it can work on x86_64 too.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For x86_64, it does not really matter. But makes the
code equal to i386.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Save rbp twice: One is for marking the stack frame, as usual (already
there), and the other, to fill pt_regs properly. This is because bx
comes right before the last saved register in that structure, and not
bp. If the base pointer were in the place bx is today, this would not
be needed.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Coalesce v8086_mode and user_mode into a single
user_mode_vm() test.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of using SEGMENT_IS_KERNEL_CODE, use the
"user_mode" macro, which can play the same role. Delete
the former, since it now lacks any user.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some CPUs have vendor string in the middle of model_id instead of beginning
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We should check if we have ESR register before reading from it.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The remappings in setup.c are all just ordinary memory, so use
early_memremap() rather than early_ioremap().
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The exception handlers in entry_32.S should now all call
TRACE_IRQS_OFF before calling the C code. The calls to
trace_hardirqs_fixup should now be unnecessary. Remove them.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At this point interrupts are off, so let's inform the tracing
code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At this point interrupts are off, so let's inform the tracing
code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
At this point interrupts are off, so let's inform the tracing
code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Many exceptions use the same code path via the label 'error_code'
in entry_32.S. At this point interrupts are off, so let's inform
the tracing code of that fact before calling into C.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Only one use of the DO_TRAP macros remains. Expand that one and
remove the macros now.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle general protection exception with interrupt initially off.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle segment not present exception with interrupt initially off.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle no coprocessor exception with interrupt initially off.
device_not_available in entry_32.S calls either math_state_restore
or math_emulate. This patch adds an extra indirection to be
able to re-enable interrupts explicitly in traps_32.c
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The int3 exception was already takes as an interrupt and
do_int3 does not fit in the new DO_ERROR macro. This patch
just expands the DO_TRAP macro and rearranges the code a
bit.
No functional changes intended.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There is some macro magic in traps_32.c to construct standard
exception dispatch functions. This patch renames the DO_ERROR-
like macros to DO_TRAP, and introduces new DO_ERROR ones that
conditionally reenable interrupts explicitly, like x86_64.
No code changes.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 uses a helper function conditional_sti in traps_64.c which
is equal to restore_interrupts in kprobes.h. The only user of
restore_interrupts is in traps_32.c. Introduce conditional_sti
for i386 and remove restore_interrupts.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Secondary cpus start with local interrupts disabled.
start_secondary() first initializes the new cpu, then it enables the
local interrupts. (although interrupts are enabled within smp_callin()
as well).
Right now, the local interrupts are enabled as a side effect of calling
ipi_call_lock_irq().
The attached patch clarifies when local interrupts are enabled.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
propagate an error in enabling the PCI device.
Also eliminates this warning:
arch/x86/kernel/amd_iommu_init.c: In function ‘init_iommu_one’:
arch/x86/kernel/amd_iommu_init.c:726: warning: ignoring return value of ‘pci_enable_device’, declared with attribute warn_unused_result
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/io_apic_64.c: In function ‘print_local_APIC’:
arch/x86/kernel/io_apic_64.c:1284: warning: format ‘%08x’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/x86/kernel/io_apic_64.c:1285: warning: format ‘%08x’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
We want to print the two halves of 'icr' at 32 bit width.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix warning:
arch/x86/kernel/early_printk.c:993: warning: ‘enable_debug_console’ defined but not used
Eliminate dead code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix warning:
arch/x86/kernel/xsave.c: In function ‘save_i387_xstate’:
arch/x86/kernel/xsave.c:98: warning: ignoring return value of ‘__clear_user’, declared with attribute warn_unused_result
check the return value and act on it. We should not be ignoring faults
at this point.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This adds a user_regset type for the x86 io permissions bitmap.
This makes it appear in core dumps (when ioperm has been used).
It will also make it visible to debuggers in the future.
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
[conflict resolutions: Signed-off-by: Ingo Molnar <mingo@elte.hu> ]
Do not crash when enumerating supported CPU architectures
SECURITY_INIT somehow ended up in x86_cpu_dev.init section. That caused printk
in code which prints supported architectures to hit #GP due to non-canonical
address being used.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Cc: thomas.petazzoni@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It's possible for get_wchan() to dereference past task->stack + THREAD_SIZE
while iterating through instruction pointers if fp equals the upper boundary,
causing a kernel panic.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-v28-for-linus-phase4-D' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (186 commits)
x86, debug: print more information about unknown CPUs
x86 setup: handle more than 8 CPU flag words
x86: cpuid, fix typo
x86: move transmeta cap read to early_init_transmeta()
x86: identify_cpu_without_cpuid v2
x86: extended "flags" to show virtualization HW feature in /proc/cpuinfo
x86: move VMX MSRs to msr-index.h
x86: centaur_64.c remove duplicated setting of CONSTANT_TSC
x86: intel.c put workaround for old cpus together
x86: let intel 64-bit use intel.c
x86: make intel_64.c the same as intel.c
x86: make intel.c have 64-bit support code
x86: little clean up of intel.c/intel_64.c
x86: make 64 bit to use amd.c
x86: make amd_64 have 32 bit code
x86: make amd.c have 64bit support code
x86: merge header in amd_64.c
x86: add srat_detect_node for amd64
x86: remove duplicated force_mwait
x86: cpu make amd.c more like amd_64.c v2
...
* 'x86-v28-for-linus-phase3-B' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
AMD IOMMU: use iommu_device_max_index, fix
AMD IOMMU: use iommu_device_max_index
x86: add PCI IDs for AMD Barcelona PCI devices
x86/iommu: use __GFP_ZERO instead of memset for GART
x86/iommu: convert GART need_flush to bool
x86/iommu: make GART driver checkpatch clean
x86 gart: remove unnecessary initialization
x86: restore old GART alloc_coherent behavior
revert "x86: make GART to respect device's dma_mask about virtual mappings"
x86: export pci-nommu's alloc_coherent
iommu: remove fullflush and nofullflush in IOMMU generic option
x86: remove set_bit_string()
iommu: export iommu_area_reserve helper function
AMD IOMMU: use coherent_dma_mask in alloc_coherent
add AMD IOMMU tree to MAINTAINERS file
AMD IOMMU: use cmd_buf_size when freeing the command buffer
AMD IOMMU: calculate IVHD size with a function
AMD IOMMU: remove unnecessary cast to u64 in the init code
AMD IOMMU: free domain bitmap with its allocation order
AMD IOMMU: simplify dma_mask_to_pages
...
* 'x86-v28-for-linus-phase2-B' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
x86, cpa: make the kernel physical mapping initialization a two pass sequence, fix
x86, pat: cleanups
x86: fix pagetable init 64-bit breakage
x86: track memtype for RAM in page struct
x86, cpa: srlz cpa(), global flush tlb after splitting big page and before doing cpa
x86, cpa: remove cpa pool code
x86, cpa: no need to check alias for __set_pages_p/__set_pages_np
x86, cpa: dont use large pages for kernel identity mapping with DEBUG_PAGEALLOC
x86, cpa: make the kernel physical mapping initialization a two pass sequence
x86, cpa: remove USER permission from the very early identity mapping attribute
x86, cpa: rename PTE attribute macros for kernel direct mapping in early boot
x86: make sure the CPA test code's use of _PAGE_UNUSED1 is obvious
linux-next: fix x86 tree build failure
x86: have set_memory_array_{uc,wb} coalesce memtypes, fix
agp: enable optimized agp_alloc_pages methods
x86: have set_memory_array_{uc,wb} coalesce memtypes.
x86: {reverve,free}_memtype() take a physical address
x86: fix pageattr-test
agp: add agp_generic_destroy_pages()
agp: generic_alloc_pages()
...
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Fix BUG: using smp_processor_id() in preemptible code
[CPUFREQ] Don't export governors for default governor
[CPUFREQ][6/6] cpufreq: Add idle microaccounting in ondemand governor
[CPUFREQ][5/6] cpufreq: Changes to get_cpu_idle_time_us(), used by ondemand governor
[CPUFREQ][4/6] cpufreq_ondemand: Parameterize down differential
[CPUFREQ][3/6] cpufreq: get_cpu_idle_time() changes in ondemand for idle-microaccounting
[CPUFREQ][2/6] cpufreq: Change load calculation in ondemand for software coordination
[CPUFREQ][1/6] cpufreq: Add cpu number parameter to __cpufreq_driver_getavg()
[CPUFREQ] use deferrable delayed work init in conservative governor
[CPUFREQ] drivers/cpufreq/cpufreq.c: Adjust error handling code involving cpufreq_cpu_put
[CPUFREQ] add error handling for cpufreq_register_governor() error
[CPUFREQ] acpi-cpufreq: add error handling for cpufreq_register_driver() error
[CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/powernow-k6.c
[CPUFREQ] Coding style fixes to arch/x86/kernel/cpu/cpufreq/elanfreq.c
x86_64 SMP suspend to RAM uses a 10k temporary stack for saving the
kernel state, but only 4k of it is used. Shrink it to 4k.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Len Brown <len.brown@intel.com>
x86_64 SMP suspend to RAM uses a 10k temporary stack for saving the
kernel state, but only 4k of it is used. Shrink it to 4k.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'sched-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits)
sched debug: add name to sched_domain sysctl entries
sched: sync wakeups vs avg_overlap
sched: remove redundant code in cpu_cgroup_create()
sched_rt.c: resch needed in rt_rq_enqueue() for the root rt_rq
cpusets: scan_for_empty_cpusets(), cpuset doesn't seem to be so const
sched: minor optimizations in wake_affine and select_task_rq_fair
sched: maintain only task entities in cfs_rq->tasks list
sched: fixup buddy selection
sched: more sanity checks on the bandwidth settings
sched: add some comments to the bandwidth code
sched: fixlet for group load balance
sched: rework wakeup preemption
CFS scheduler: documentation about scheduling policies
sched: clarify ifdef tangle
sched: fix list traversal to use _rcu variant
sched: turn off WAKEUP_OVERLAP
sched: wakeup preempt when small overlap
kernel/cpu.c: create a CPU_STARTING cpu_chain notifier
kernel/cpu.c: Move the CPU_DYING notifiers
sched: fix __load_balance_iterator() for cfq with only one task
...
This merges phase 1 of the x86 tree, which is a collection of branches:
x86/alternatives, x86/cleanups, x86/commandline, x86/crashdump,
x86/debug, x86/defconfig, x86/doc, x86/exports, x86/fpu, x86/gart,
x86/idle, x86/mm, x86/mtrr, x86/nmi-watchdog, x86/oprofile,
x86/paravirt, x86/reboot, x86/sparse-fixes, x86/tsc, x86/urgent and
x86/vmalloc
and as Ingo says: "these are the easiest, purely independent x86 topics
with no conflicts, in one nice Octopus merge".
* 'x86-v28-for-linus-phase1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (147 commits)
x86: mtrr_cleanup: treat WRPROT as UNCACHEABLE
x86: mtrr_cleanup: first 1M may be covered in var mtrrs
x86: mtrr_cleanup: print out correct type v2
x86: trivial printk fix in efi.c
x86, debug: mtrr_cleanup print out var mtrr before change it
x86: mtrr_cleanup try gran_size to less than 1M, v3
x86: mtrr_cleanup try gran_size to less than 1M, cleanup
x86: change MTRR_SANITIZER to def_bool y
x86, debug printouts: IOMMU setup failures should not be KERN_ERR
x86: export set_memory_ro and set_memory_rw
x86: mtrr_cleanup try gran_size to less than 1M
x86: mtrr_cleanup prepare to make gran_size to less 1M
x86: mtrr_cleanup safe to get more spare regs now
x86_64: be less annoying on boot, v2
x86: mtrr_cleanup hole size should be less than half of chunk_size, v2
x86: add mtrr_cleanup_debug command line
x86: mtrr_cleanup optimization, v2
x86: don't need to go to chunksize to 4G
x86_64: be less annoying on boot
x86, olpc: fix endian bug in openfirmware workaround
...
Write the name of the unknown vendor_id to output instead of just
"unknown".
Tag changed to 'vendor_id' as used in /proc/cpuinfo
Signed-off-by: Hans Schou <linux@schou.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a cpu parameter to __cpufreq_driver_getavg(). This is needed for software
cpufreq coordination where policy->cpu may not be same as the CPU on which we
want to getavg frequency.
A follow-on patch will use this parameter to getavg freq from all cpus
in policy->cpus.
Change since last patch. Fix the offline/online and suspend/resume
oops reported by Youquan Song <youquan.song@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
add error handling for cpufreq_register_driver() error
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: cpufreq@lists.linux.org.uk
Signed-off-by: Dave Jones <davej@redhat.com>
Replace the no longer working links and email address in the
documentation and in source code.
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Dave Jones <davej@redhat.com>
If a processor implementation discern that a processor state component is in
its initialized state, it may modify the corresponding bit in the
xsave header.xstate_bv as '0'. State in the memory layout setup by 'xsave'
will be consistent with the bit values in the header.
During signal handling, legacy applications may change the FP/SSE bits
in the sigcontext memory layout without touching the FP/SSE header bits
in the xsave header. So always set FP/SSE bits in the xsave header
while saving the sigcontext state to the user space. During signal return,
this will enable the kernel to capture any changes to the FP/SSE bits by the
legacy applications which don't touch xsave headers.
xsave aware apps can change the xstate_bv in the xsave header aswell
as change any contents in the memory layout. xrestor as part of sigreturn
will capture all the changes.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This PCI ID based quick should be a full solution for the IRQ0 override
related slowdown problem on SB450 based systems:
33fb0e4: x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC
Emit a warning in those cases where the DMI quirk triggers but
the PCI ID based quirk didnt.
If this warning does not trigger then we can phase out the DMI quirks.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On some HP nx6... laptops (e.g. nx6325) BIOS reports an IRQ0 override
but the SB450 chipset is configured such that timer interrupts goe to
INT0 of IOAPIC.
Check IRQ0 routing and if it is routed to INT0 of IOAPIC skip the
timer override.
[ This more generic PCI ID based quirk should alleviate the need for
dmi_ignore_irq0_timer_override DMI quirks. ]
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: "Maciej W. Rozycki" <macro@linux-mips.org>
Tested-by: Dmitry Torokhov <dtor@mail.ru>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: gart iommu have direct mapping when agp is present too
Stress-testing KVM's latest NMI support with kgdbts inside an SMP guest,
I came across spurious unhandled NMIs while running the singlestep test.
Looking closer at the code path each NMI takes when KGDB is enabled, I
noticed that kgdb_nmicallback is called twice per event: One time via
DIE_NMI_IPI notification, the second time on DIE_NMI. Removing the first
invocation cures the unhandled NMIs here.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
There is a bug in the BIOSes of some HP boxes with AMD Turions which
connects IO-APIC pins with ACPI thermal trip points in such a way that
if the state of the IO-APIC is not as expected by the (buggy) BIOS, the
thermal trip points are set to insanely low values (usually all of them
become 16 degrees Celsius). As a result, thermal throttling kicks in
and knock the system down to its shoes.
Unfortunately some of the recent IO-APIC changes made the bug show up.
To prevent this from happening, blacklist machines that are known to be
affected (nx6115 and 6715b in this particular case).
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=11516 listed as
a regression from 2.6.26.
On my box it was caused by:
commit 691874fa96
Author: Maciej W. Rozycki <macro@linux-mips.org>
Date: Tue May 27 21:19:51 2008 +0100
x86: I/O APIC: timer through 8259A second-chance
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
and the whole story is described in this (huge) thread:
http://marc.info/?l=linux-kernel&m=121358440508410&w=4
Matthew Garrett told us about that happening on the nx6125:
http://marc.info/?l=linux-kernel&m=121396307411930&w=4
and then Maciej analysed the breakage on the basis of a DSDT from the
nx6325:
http://marc.info/?l=linux-kernel&m=121401068718826&w=4
As far as the Dmitry's and Jason's boxes are concerned, I recognized the
symptoms and asked them to verify that the blacklisting helped.
It appears that the buggy BIOS code has been copy-pasted to the entire
range of machines, for no good reason.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Jason Vas Dias <jason.vas.dias@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
move init_memory_mapping() out of init_k8_gatt.
for: http://bugzilla.kernel.org/show_bug.cgi?id=11676
2.6.27-rc2 to rc8, apgart fails, iommu=soft works, regression
This is needed because we need to map the GART aperture even
if the GATT is not initialized.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For the purpose of MTRR canonicalization, treat WRPROT as UNCACHEABLE.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The first 1M is don't care when it comes to the variables MTRRs.
Cover it as WB as a heuristic approximation; this is generally what we
want to minimize the number of registers.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Print out the correct type when the Write Protected (WP) type is seen.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
[patch] x86: Trivial printk fix in efi.c
The following line is lacking a space between "memdesc" and "doesn't".
"Kernel-defined memdescdoesn't match the one from EFI!"
Fixed the printk by adding a space.
Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
remove braces and indent for flags and fpstate in restore_sigcontext().
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This one took a long time to rear up because LDT usage is not very
common, but the bug is quite serious. It got introduced along with
another bug, already fixed, by 75b8bb3e56
After investigating a JRE failure, I found this bug was introduced a long time
ago, and had already managed to survive another bugfix which occurred on the
same line. The result is a total failure of the JRE due to LDT selectors not
working properly.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Glauber de Oliveira Costa <gcosta@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After investigating a JRE failure, I found this bug was introduced a
long time ago, and had already managed to survive another bugfix which
occurred on the same line. The result is a total failure of the JRE due
to LDT selectors not working properly.
This one took a long time to rear up because LDT usage is not very
common, but the bug is quite serious. It got introduced along with
another bug, already fixed, by 75b8bb3e56
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Glauber de Oliveira Costa <gcosta@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The number of BIOSes that have an option to enable the IOMMU, or fix
anything about its configuration, is vanishingly small. There's no good
reason to punish quiet boot for this.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Delay exit to make sure we can actually get the optimal result in as
many cases as possible.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
v2: should check with half of range0 size instead of chunk_size
So don't have silly big hole.
in hpa's case we could auto detect instead of adding mtrr_chunk_size in
command line.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>