When running with CONFIG_PARAVIRT, we may want lots of IRQs even if
there's no IO APIC.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Add clockevent drivers for i386: lapic (local) and PIT/HPET (global). Update
the timer IRQ to call into the PIT/HPET driver's event handler and the
lapic-timer IRQ to call into the lapic clockevent driver. The assignement of
timer functionality is delegated to the core framework code and replaces the
compile and runtime evalution in do_timer_interrupt_hook()
Use the clockevents broadcast support and implement the lapic_broadcast
function for ACPI.
No changes to existing functionality.
[ kdump fix from Vivek Goyal <vgoyal@in.ibm.com> ]
[ fixes based on review feedback from Arjan van de Ven <arjan@infradead.org> ]
Cleanups-from: Adrian Bunk <bunk@stusta.de>
Build-fixes-from: Andrew Morton <akpm@osdl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create a paravirt.h header for all the critical operations which need to be
replaced with hypervisor calls, and include that instead of defining native
operations, when CONFIG_PARAVIRT.
This patch does the dumbest possible replacement of paravirtualized
instructions: calls through a "paravirt_ops" structure. Currently these are
function implementations of native hardware: hypervisors will override the ops
structure with their own variants.
All the pv-ops functions are declared "fastcall" so that a specific
register-based ABI is used, to make inlining assember easier.
And:
+From: Andy Whitcroft <apw@shadowen.org>
The paravirt ops introduce a 'weak' attribute onto memory_setup().
Code ordering leads to the following warnings on x86:
arch/i386/kernel/setup.c:651: warning: weak declaration of
`memory_setup' after first use results in unspecified behavior
Move memory_setup() to avoid this.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
In timer_interrupt(), this sort of change will be necessary:
- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
This patch removes the change in behavior of the irq allocation code when
CONFIG_PCI_MSI is defined. Removing all instances of the assumption that irq
== vector.
create_irq is rewritten to first allocate a free irq and then to assign that
irq a vector.
assign_irq_vector is made static and the AUTO_ASSIGN case which allocates an
vector not bound to an irq is removed.
The ioapic vector methods are removed, and everything now works with irqs.
The definition of NR_IRQS no longer depends on CONFIG_PCI_MSI
[akpm@osdl.org: cleanup]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pass ticks to do_timer() and update_times(), and adjust x86_64 and s390
timer interrupt handler with this change.
Currently update_times() calculates ticks by "jiffies - wall_jiffies", but
callers of do_timer() should know how many ticks to update. Passing ticks
get rid of this redundant calculation. Also there are another redundancy
pointed out by Martin Schwidefsky.
This cleanup make a barrier added by
5aee405c66 needless. So this patch removes
it.
As a bonus, this cleanup make wall_jiffies can be removed easily, since now
wall_jiffies is always synced with jiffies. (This patch does not really
remove wall_jiffies. It would be another cleanup patch)
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Vitezslav Samel <samel@mail.cz> reports that an HP DL380 g4 fails using the
default arch due to the ISA bus having an ID of 32.
It would have worked OK with the generic arch - for some reason the default
arch doesn't support as many busses.
So bump that up to support 256 busses, but leave it at 32 if we're building a
tiny system to save a bit of memory.
Cc: Vitezslav Samel <samel@mail.cz>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
On some i386/x86_64 systems, sending an NMI IPI as a broadcast will
reset the system. This seems to be a BIOS bug which affects machines
where one or more cpus are not under OS control. It occurs on HT
systems with a version of the OS that is not compiled without HT
support. It also occurs when a system is booted with max_cpus=n where
2 <= n < cpus known to the BIOS. The fix is to always send NMI IPI as
a mask instead of as a broadcast.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As part of the i386 conversion to the generic timekeeping infrastructure, this
introduces a new tsc.c file. The code in this file replaces the TSC
initialization, management and access code currently in timer_tsc.c (which
will be removed) that we want to preserve.
The code also introduces the following functionality:
o tsc_khz: like cpu_khz but stores the TSC frequency on systems that do not
change TSC frequency w/ CPU frequency
o check/mark_tsc_unstable: accessor/modifier flag for TSC timekeeping
usability
o minor cleanups to calibration math.
This patch also includes a one line __cpuinitdata fix from Zwane Mwaikambo.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up and refactor i386 sub-architecture setup.
This change moves all the code from the
asm-i386/mach-*/setup_arch_pre/post.h headers, into
arch/i386/mach-*/setup.c. mach-*/setup_arch_pre.h is renamed to
setup_arch.h, and contains only things which should be in header files. It
is purely code-motion; there should be no functional changes at all.
Several functions in arch/i386/kernel/setup.c needed to be made non-static
so that they're visible to the code in mach-*/setup.c. asm-i386/setup.h is
used to hold the prototypes for these functions.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Cc: Martin Bligh <mbligh@google.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Andrey Panin <pazke@donpac.ru>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix up some RTC whitespace and style
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Reading the CMOS clock on x86 and some other arches currently takes up to one
second because it synchronizes with the CMOS second tick-over. This delay
shows up at boot time as well a resume time.
This is the currently the most substantial boot time delay for machines that
are working towards instant-on capability. Also, a quick back of the envelope
calculation (.5sec * 2M users * 1 boot a day * 10 years) suggests it has cost
Linux users in the neighborhood of a million man-hours.
An earlier thread on this topic is here:
http://groups.google.com/group/linux.kernel/browse_frm/thread/8a24255215ff6151/2aa97e66a977653d?hl=en&lr=&ie=UTF-8&rnum=1&prev=/groups%3Fhl%3Den%26lr%3D%26ie%3DUTF-8%26selm%3D1To2R-2S7-11%40gated-at.bofh.it#2aa97e66a977653d
..from which the consensus seems to be that it's no longer desirable.
In my view, there are basically four cases to consider:
1) networked, need precise walltime: use NTP
2) networked, don't need precise walltime: use NTP anyway
3) not networked, don't need sub-second precision walltime: don't care
4) not networked, need sub-second precision walltime:
get a network or a radio time source because RTC isn't good enough anyway
So this patch series simply removes the synchronization in favor of a simple
seqlock-like approach using the seconds value.
Note that for purposes of timer accuracy on wakeup, this patch will cause us
to fire timers up to one second late. But as the current timer resume code
will already sync once (or more!), it's no worse for short timers.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
>commit 76381fee7e
>Author: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
>Date: Thu Jun 23 00:08:46 2005 -0700
>
> [PATCH] xen: x86_64: use more usermode macro
>
> Make use of the user_mode macro where it's possible. This is useful for Xen
> because it will need only to redefine only the macro to a hypervisor call.
I am of the opinion that the above changeset is incomplete, i.e. it missed
converting some previous uses of user_mode to user_mode_vm. While most of
them could be considered just cosmetical, at least the one in die_nmi
doesn't appear to be.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
Cc: Zachary Amsden <zach@vmware.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I noticed that some lowlevel send_IPI_mask helpers had a hotplug/preempt
race whereupon the cpu_online_map was read before disabling preemption;
...
cpumask_t mask = cpu_online_map;
int cpu = get_cpu();
cpu_clear(cpu, mask);
...
But then i realised that there is no need for these lowlevel functions to
be going through all this trouble when all the callers are already made
hotplug/preempt safe.
Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Old code could retry for 10 seconds worst time. Only try it
for one second now.
Suggested by Yinghai Lu
Cc: Yinghai.Lu@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I have a system (Biostar IDEQ210M mini-pc with a VIA chipset) which will
not reboot unless a keyboard is plugged in to it. I have tried all
combinations of the kernel "reboot=x,y" flags to no avail. Rebooting by
any method will leave the system in a wedged state (at the "Restarting
system" message).
I finally tracked the problem down to the machine's refusal to fully reboot
unless the keyboard controller status register had bit 2 set. This is the
"System flag" which when set, indicates successful completion of the
keyboard controller self-test (Basic Assurance Test, BAT).
I suppose that something is trying to protect against sporadic reboots
unless the keyboard controller is in a good state (a keyboard is present),
but I need this machine to be headless.
I found that setting the system flag (via the command byte) before giving
the "pulse reset line" command will allow the reboot to proceed. The patch
is simple, and I think it should be fine for everybody whether they have
this type of machine or not. This affects the "hard" reboot (as done when
the kernel boot flags "reboot=c,h" are used).
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch introduces a startup parameter no_broadcast. When we enable
CONFIG_HOTPLUG_CPU, we dont want to use broadcast shortcut as it has ill
effects on a offline cpu. If we issue broadcast, the IPI is also delivered
to offline cpus, or partially up cpu causing stale IPI's to be handled,
which is a problem and can cause undesirable effects.
Introduces a new startup cmdline option no_ipi_broadcast, that can be
switched at cmdline if necessary.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is per Andi's request to remove NO_IOAPIC_CHECK from genapic and
use heuristics to prevent unique I/O APIC ID check for systems that don't
need it. The patch disables unique I/O APIC ID check for Xeon-based and
other platforms that don't use serial APIC bus for interrupt delivery.
Andi stated that AMD systems don't need unique IO_APIC_IDs either.
Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!