linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-18 09:44:18 +08:00

Author	SHA1	Message	Date
Eric W. Biederman	1bc3b91aee	[PATCH] crashdump: x86 crashkernel option This is the x86 implementation of the crashkernel option. It reserves a window of memory very early in the bootup process, so we never use it for anything but the kernel to switch to when the running kernel panics. In addition to reserving this memory a resource structure is registered so looking at /proc/iomem it is clear what happened to that memory. ISSUES: Is it possible to implement this in a architecture generic way? What should be done with architectures that always use an iommu and thus don't report their RAM memory resources in /proc/iomem? Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:50 -07:00
Eric W. Biederman	63d30298ef	[PATCH] kexec: x86 shutdown APICs during crash_shutdown In the case of a crash/panic an architecture specific function machine_crash_shutdown is called. This patch adds to the x86 machine_crash function the standard kernel code for shutting down apics. Every line of code added to that function increases the risk that we will call code after a kernel panic that is not safe. This patch should not make it to the stable kernel without a being reviewed a lot more. It is unclear how much a hardned kernel can take when it comes to misconfigured apics. So since a normal kernel has problems this patch does a clean shutdown. It is my expectation this patch will be dropped from future generations of the kexec work. But for the moment it is a crutch to keep from breaking everything. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:50 -07:00
Eric W. Biederman	2c818b45a2	[PATCH] kexec: x86: snapshot registers during crash shutdown After the kernel panics if we wish to generate an entire machine core file it is very nice to know the register state at the time the machine crashed. After long discussion it was realized that if you are going to be saving the information anyway it is reasonable to store the information in a format that it will be used and recognized in so the register state is stored in the standard ELF note format. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:49 -07:00
Eric W. Biederman	c4ac4263a0	[PATCH] crashdump: x86: add NMI handler to capture other CPUs One of the dangers when switching from one kernel to another is what happens to all of the other cpus that were running in the crashed kernel. In an attempt to avoid that problem this patch adds a nmi handler and attempts to shoot down the other cpus by sending them non maskable interrupts. The code then waits for 1 second or until all known cpus have stopped running and then jumps from the running kernel that has crashed to the kernel in reserved memory. The kernel spin loop is used for the delay as that should behave continue to be safe even in after a crash. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:49 -07:00
Eric W. Biederman	5033cba087	[PATCH] kexec: x86 kexec core This is the i386 implementation of kexec. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:49 -07:00
Eric W. Biederman	dd2a13054f	[PATCH] kexec: x86: factor out apic shutdown code Factor out the apic and smp shutdown code from machine_restart so it can be called by in the kexec reboot path as well. By switching to the bootstrap cpu by default on reboot I can delete/simplify some motherboard fixups well. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:49 -07:00
Vivek Goyal	8a9190853c	[PATCH] kexec: reserve Bootmem fix for booting nondefault location kernel This patch fixes a problem with reserving memory during boot up of a kernel built for non-default location. Currently boot memory allocator reserves the memory required by kernel image, boot allocaotor bitmap etc. It assumes that kernel is loaded at 1MB (HIGH_MEMORY hard coded to 1024*1024). But kernel can be built for non-default locatoin, hence existing hardcoding will lead to reserving unnecessary memory. This patch fixes it. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:48 -07:00
Eric W. Biederman	3d345e3fc9	[PATCH] kexec: x86: add CONFIG_PYSICAL_START For one kernel to report a crash another kernel has created we need to have 2 kernels loaded simultaneously in memory. To accomplish this the two kernels need to built to run at different physical addresses. This patch adds the CONFIG_PHYSICAL_START option to the x86 kernel so we can do just that. You need to know what you are doing and the ramifications are before changing this value, and most users won't care so I have made it depend on CONFIG_EMBEDDED bzImage kernels will work and run at a different address when compiled with this option but they will still load at 1MB. If you need a kernel loaded at a different address as well you need to boot a vmlinux. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:48 -07:00
Eric W. Biederman	ad0d75ebac	[PATCH] kexec: x86: vmlinux: fix physical addresses The vmlinux on i386 does not report the correct physical address of the kernel. Instead in the physical address field it currently reports the virtual address of the kernel. This is patch is a bug fix that corrects vmlinux to report the proper physical addresses. This is potentially a help for crash dump analysis tools. This definitiely allows bootloaders that load vmlinux as a standard ELF executable. Bootloaders directly loading vmlinux become of practical importance when we consider the kexec on panic case. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:47 -07:00
Eric W. Biederman	650927ef8a	[PATCH] kexec: x86: resture apic virtual wire mode on shutdown When coming out of apic mode attempt to set the appropriate apic back into virtual wire mode. This improves on previous versions of this patch by by never setting bot the local apic and the ioapic into veritual wire mode. This code looks at data from the mptable to see if an ioapic has an ExtInt input to make this decision. A future improvement is to figure out which apic or ioapic was in virtual wire mode at boot time and to remember it. That is potentially a more accurate method, of selecting which apic to place in virutal wire mode. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:47 -07:00
Eric W. Biederman	cee5dab485	[PATCH] kexec: x86: i8259 shutdown: disable interrupts From: Eric W. Biederman <ebiederm@xmission.com> This patch disables interrupt generation from the legacy pic on reboot. Now that there is a sys_device class it should not be called while drivers are still using interrupts. There is a report about this breaking ACPI power off on some systems. http://bugme.osdl.org/show_bug.cgi?id=4041 However the final comment seems to exonerate this code. So until I get more information I believe that was a false positive. Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:46 -07:00
Eric W. Biederman	9635b47d91	[PATCH] kexec: x86: local apic fix From: "Maciej W. Rozycki" <macro@linux-mips.org> Fix a kexec problem whcih causes local APIC detection failure. The problem is detect_init_APIC() is called early, before the command line have been processed. Therefore "lapic" (and "nolapic") have not been seen, yet. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:46 -07:00
Ingo Molnar	cc19ca86a0	[PATCH] consolidate PREEMPT options into kernel/Kconfig.preempt This patch consolidates the CONFIG_PREEMPT and CONFIG_PREEMPT_BKL preemption options into kernel/Kconfig.preempt. This, besides reducing source-code, also enables more centralized tweaking of preemption related options. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:45 -07:00
Pavel Machek	8d783b3e02	[PATCH] swsusp: clean assembly parts This patch fixes register saving so that each register is only saved once, and adds missing saving of %cr8 on x86-64. Some reordering so that save/restore is more logical/safer (segment registers should be restored after gdt). Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:33 -07:00
Pavel Machek	648be31881	[PATCH] swsusp: kill config_pm_disk CONFIG_PM_DISK is long gone, but it still managed to survived at few places. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:32 -07:00
Li Shaohua	5a72e04df5	[PATCH] suspend/resume SMP support Using CPU hotplug to support suspend/resume SMP. Both S3 and S4 use disable/enable_nonboot_cpus API. The S4 part is based on Pavel's original S4 SMP patch. Signed-off-by: Li Shaohua<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:32 -07:00
Ashok Raj	76e4f660d9	[PATCH] x86_64: CPU hotplug support Experimental CPU hotplug patch for x86_64 ----------------------------------------- This supports logical CPU online and offline. - Test with maxcpus=1, and then kick other cpu's off to test if init code is all cleaned up. CONFIG_SCHED_SMT works as well. - idle threads are forked on demand from keventd threads for clean startup TBD: 1. Not tested on a real NUMA machine (tested with numa=fake=2) 2. Handle ACPI pieces for physical hotplug support. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: Andi Kleen <ak@muc.de> Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Shaohua.li<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:30 -07:00
Li Shaohua	e1367daf3e	[PATCH] cpu state clean after hot remove Clean CPU states in order to reuse smp boot code for CPU hotplug. Signed-off-by: Li Shaohua<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:30 -07:00
Li Shaohua	0bb3184df5	[PATCH] init call cleanup Trival patch for CPU hotplug. In CPU identify part, only did cleaup for intel CPUs. Need do for other CPUs if they support S3 SMP. Signed-off-by: Li Shaohua<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:30 -07:00
Li Shaohua	d720803a93	[PATCH] sibling map initializing rework Make sibling map init per-cpu. Hotplug CPU may change the map at runtime. Signed-off-by: Li Shaohua<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:29 -07:00
Li Shaohua	6fe940d6c3	[PATCH] sep initializing rework Make SEP init per-cpu, so it is hotplug safe. Signed-off-by: Li Shaohua<shaohua.li@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:29 -07:00
Ashok Raj	67664c8f7e	[PATCH] i386: Dont use IPI broadcast when using cpu hotplug. This patch introduces a startup parameter no_broadcast. When we enable CONFIG_HOTPLUG_CPU, we dont want to use broadcast shortcut as it has ill effects on a offline cpu. If we issue broadcast, the IPI is also delivered to offline cpus, or partially up cpu causing stale IPI's to be handled, which is a problem and can cause undesirable effects. Introduces a new startup cmdline option no_ipi_broadcast, that can be switched at cmdline if necessary. Signed-off-by: Ashok Raj <ashok.raj@intel.com> Acked-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:29 -07:00
Zwane Mwaikambo	f370513640	[PATCH] i386 CPU hotplug (The i386 CPU hotplug patch provides infrastructure for some work which Pavel is doing as well as for ACPI S3 (suspend-to-RAM) work which Li Shaohua <shaohua.li@intel.com> is doing) The following provides i386 architecture support for safely unregistering and registering processors during runtime, updated for the current -mm tree. In order to avoid dumping cpu hotplug code into kernel/irq/* i dropped the cpu_online check in do_IRQ() by modifying fixup_irqs(). The difference being that on cpu offline, fixup_irqs() is called before we clear the cpu from cpu_online_map and a long delay in order to ensure that we never have any queued external interrupts on the APICs. There are additional changes to s390 and ppc64 to account for this change. 1) Add CONFIG_HOTPLUG_CPU 2) disable local APIC timer on dead cpus. 3) Disable preempt around irq balancing to prevent CPUs going down. 4) Print irq stats for all possible cpus. 5) Debugging check for interrupts on offline cpus. 6) Hacky fixup_irqs() to redirect irqs when cpus go off/online. 7) play_dead() for offline cpus to spin inside. 8) Handle offline cpus set in flush_tlb_others(). 9) Grab lock earlier in smp_call_function() to prevent CPUs going down. 10) Implement __cpu_disable() and __cpu_die(). 11) Enable local interrupts in cpu_enable() after fixup_irqs() 12) Don't fiddle with NMI on dead cpu, but leave intact on other cpus. 13) Program IRQ affinity whilst cpu is still in cpu_online_map on offline. Signed-off-by: Zwane Mwaikambo <zwane@linuxpower.ca> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:29 -07:00
Shaohua Li	d92de65cab	[PATCH] variable overflow after hundreds round of hotplug CPU I'm doing the cpu hotplug stress test and found a variable ('ready') is overflow after several hundreds rounds of cpu hotplug. Here is a fix. Signed-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:28 -07:00
Shaohua Li	a13db56624	[PATCH] CPU hotplug: fix hpet sectioning With hpet enabled, cpu hotplug uses some routines marked with __init. Signed-off-by: Shaohua Li<shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:28 -07:00
Andrey Panin	1249c5138f	[PATCH] dmi: spring cleanup Whitespace and CodingStyle cleanup. No functionality changes. Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:28 -07:00
Andrey Panin	b625883f24	[PATCH] dmi: remove central blacklist Since last dmi quirk looks useless (it just prints 404 compliant url) we can finally remove central dmi blacklist. Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:28 -07:00
Andrey Panin	0f8133a8db	[PATCH] dmi: move ACPI sleep quirk This patch moves ACPI sleep quirk out of dmi_scan.c Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:28 -07:00
Andrey Panin	aea00143a8	[PATCH] dmi: move ACPI boot quirk This patch moves ACPI boot quirks out of dmi_scan.c Signed-off-by: Andrey Panin <pazke@donpac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:27 -07:00
Dmitry Torokhov	e70c9d5e61	[PATCH] I8K: use standard DMI interface I8K: Change to use stock dmi infrastructure instead of homegrown parsing code. The driver now requires box's DMI data to match list of supported models so driver can be safely compiled-in by default without fear of it poking into random SMM BIOS code. DMI checks can be ignored with i8k.ignore_dmi option. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-25 16:24:24 -07:00
Prasanna S Panchamukhi	417c8da651	[PATCH] kprobes: Temporary disarming of reentrant probe for i386 This patch includes i386 architecture specific changes to support temporary disarming on reentrancy of probes. Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:24 -07:00
Hien Nguyen	0aa55e4d7d	[PATCH] kprobes: moves lock-unlock to non-arch kprobe_flush_task This patch moves the lock/unlock of the arch specific kprobe_flush_task() to the non-arch specific kprobe_flusk_task(). Signed-off-by: Hien Nguyen <hien@us.ibm.com> Acked-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:21 -07:00
Rusty Lynch	7e1048b11c	[PATCH] Move kprobe [dis]arming into arch specific code The architecture independent code of the current kprobes implementation is arming and disarming kprobes at registration time. The problem is that the code is assuming that arming and disarming is a just done by a simple write of some magic value to an address. This is problematic for ia64 where our instructions look more like structures, and we can not insert break points by just doing something like: p->addr = BREAKPOINT_INSTRUCTION; The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent functions: void arch_arm_kprobe(struct kprobe p) void arch_disarm_kprobe(struct kprobe *p) and then adds the new functions for each of the architectures that already implement kprobes (spar64/ppc64/i386/x86_64). I thought arch_[dis]arm_kprobe was the most descriptive of what was really happening, but each of the architectures already had a disarm_kprobe() function that was really a "disarm and do some other clean-up items as needed when you stumble across a recursive kprobe." So... I took the liberty of changing the code that was calling disarm_kprobe() to call arch_disarm_kprobe(), and then do the cleanup in the block of code dealing with the recursive kprobe case. So far this patch as been tested on i386, x86_64, and ppc64, but still needs to be tested in sparc64. Signed-off-by: Rusty Lynch <rusty.lynch@intel.com> Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:21 -07:00
Hien Nguyen	b94cce926b	[PATCH] kprobes: function-return probes This patch adds function-return probes to kprobes for the i386 architecture. This enables you to establish a handler to be run when a function returns. 1. API Two new functions are added to kprobes: int register_kretprobe(struct kretprobe rp); void unregister_kretprobe(struct kretprobe rp); 2. Registration and unregistration 2.1 Register To register a function-return probe, the user populates the following fields in a kretprobe object and calls register_kretprobe() with the kretprobe address as an argument: kp.addr - the function's address handler - this function is run after the ret instruction executes, but before control returns to the return address in the caller. maxactive - The maximum number of instances of the probed function that can be active concurrently. For example, if the function is non- recursive and is called with a spinlock or mutex held, maxactive = 1 should be enough. If the function is non-recursive and can never relinquish the CPU (e.g., via a semaphore or preemption), NR_CPUS should be enough. maxactive is used to determine how many kretprobe_instance objects to allocate for this particular probed function. If maxactive <= 0, it is set to a default value (if CONFIG_PREEMPT maxactive=max(10, 2 * NR_CPUS) else maxactive=NR_CPUS) For example: struct kretprobe rp; rp.kp.addr = /* entrypoint address / rp.handler = /return probe handler / rp.maxactive = / e.g., 1 or NR_CPUS or 0, see the above explanation */ register_kretprobe(&rp); The following field may also be of interest: nmissed - Initialized to zero when the function-return probe is registered, and incremented every time the probed function is entered but there is no kretprobe_instance object available for establishing the function-return probe (i.e., because maxactive was set too low). 2.2 Unregister To unregiter a function-return probe, the user calls unregister_kretprobe() with the same kretprobe object as registered previously. If a probed function is running when the return probe is unregistered, the function will return as expected, but the handler won't be run. 3. Limitations 3.1 This patch supports only the i386 architecture, but patches for x86_64 and ppc64 are anticipated soon. 3.2 Return probes operates by replacing the return address in the stack (or in a known register, such as the lr register for ppc). This may cause __builtin_return_address(0), when invoked from the return-probed function, to return the address of the return-probes trampoline. 3.3 This implementation uses the "Multiprobes at an address" feature in 2.6.12-rc3-mm3. 3.4 Due to a limitation in multi-probes, you cannot currently establish a return probe and a jprobe on the same function. A patch to remove this limitation is being tested. This feature is required by SystemTap (http://sourceware.org/systemtap), and reflects ideas contributed by several SystemTap developers, including Will Cohen and Ananth Mavinakayanahalli. Signed-off-by: Hien Nguyen <hien@us.ibm.com> Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Frederik Deweerdt <frederik.deweerdt@laposte.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:21 -07:00
Vincent Hanquez	717b594a41	[PATCH] xen: x86: Use more usermode macro Use the user_mode macro where it's possible. Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:14 -07:00
Vincent Hanquez	fa1e1bdf78	[PATCH] xen: x86: Rename usermode macro Rename user_mode to user_mode_vm and add a user_mode macro similar to the x86-64 one. This is useful for Xen because the linux xen kernel does not runs on the same priviledge that a vanilla linux kernel, and with this we just need to redefine user_mode(). Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:14 -07:00
Vincent Hanquez	1cc6f12e03	[PATCH] xen: x86: Use new macro for debugreg Make use of the 2 new macro set_debugreg and get_debugreg. Signed-off-by: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> Cc: Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:13 -07:00
Andrew Morton	c92c6ffdb1	[PATCH] mtrr size-and-base debugging Consolidate the mtrr sanity checking, add a dump_stack(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:12 -07:00
Andrew Morton	a3a255e744	[PATCH] x86: cpu_khz type fix x86_64's cpu_khz is unsigned int and there is no reason why x86 needs to use unsigned long. So make cpu_khz unsigned int on x86 as well. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:11 -07:00
Alexey Dobriyan	129f69465b	[PATCH] Remove i386_ksyms.c, almost. * EXPORT_SYMBOL's moved to other files * #include <linux/config.h>, <linux/module.h> where needed * #include's in i386_ksyms.c cleaned up * After copy-paste, redundant due to Makefiles rules preprocessor directives removed: #ifdef CONFIG_FOO EXPORT_SYMBOL(foo); #endif obj-$(CONFIG_FOO) += foo.o * Tiny reformat to fit in 80 columns Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:11 -07:00
Aleksey Gorelov	80bb82afea	[PATCH] VIA 82C586B IRQ routing fix According to the VIA 82C586B datasheet (still available from http://gkernel.sourceforge.net/specs/via/586b.pdf.bz2) this chip need a special PIRQ mapping. Signed-off-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Aleksey Gorelov <aleksey_gorelov@phoenix.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:11 -07:00
Natalie Protasevich	c434b7a6ae	[PATCH] x86: avoid wasting IRQs for PCI devices I have submitted the patch for x86_64, this is submission for i386. The patch changes the way IRQs are handed out to PCI devices. Currently, each I/O APIC pin gets associated with an IRQ, no matter if the pin is used or not. This imposes severe limitation on systems that have designs that employ many I/O APICs, only utilizing couple lines of each, such as P64H2 chipset. It is used in ES7000, and currently, there is no way to boot the system with more that 9 I/O APICs. The simple change below allows to boot a system with say 64 (or more) I/O APICs, each providing 1 slot, which otherwise impossible because of the IRQ gaps created for unused lines on each I/O APIC. It does not resolve the problem with number of devices that exceeds number of possible IRQs, but eases up a tension for IRQs on any large system with potentually large number of devices. Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:10 -07:00
Christoph Lameter	5912100372	[PATCH] i386: Selectable Frequency of the Timer Interrupt Make the timer frequency selectable. The timer interrupt may cause bus and memory contention in large NUMA systems since the interrupt occurs on each processor HZ times per second. Signed-off-by: Christoph Lameter <christoph@lameter.com> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:10 -07:00
Jan Beulich	7fbb4f6e68	[PATCH] adjust i386 watchdog tick calculation Get the i386 watchdog tick calculation into a state where it can also be used on CPUs with frequencies beyond 4GHz, and it consolidates the calculation into a single place (for potential furture adjustments). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:09 -07:00
Natalie Protasevich	ca05fea6db	[PATCH] Do not enforce unique IO_APIC_ID check for xAPIC systems (i386) This patch is per Andi's request to remove NO_IOAPIC_CHECK from genapic and use heuristics to prevent unique I/O APIC ID check for systems that don't need it. The patch disables unique I/O APIC ID check for Xeon-based and other platforms that don't use serial APIC bus for interrupt delivery. Andi stated that AMD systems don't need unique IO_APIC_IDs either. Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:09 -07:00
Roland McGrath	7c1def1652	[PATCH] i386: never block forced SIGSEGV This problem was first noticed on PPC and has already been fixed there. But the exact same issue applies to other platforms in the same way. The signal blocking for sa_mask and the handled signal takes place after the handler setup. When the stack is bogus, the handler setup forces a SIGSEGV. But then this will be blocked, and returning to user mode will fault again and iterate. This patch fixes the problem by checking whether signal handler setup failed, and not doing the signal-blocking if so. This copies what was done in the ppc code. I think all architectures' signal handler setup code follows this pattern and needs the change. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:09 -07:00
Venkatesh Pallipadi	8a9e1b0f56	[PATCH] Platform SMIs and their interferance with tsc based delay calibration Issue: Current tsc based delay_calibration can result in significant errors in loops_per_jiffy count when the platform events like SMIs (System Management Interrupts that are non-maskable) are present. This could lead to potential kernel panic(). This issue is becoming more visible with 2.6 kernel (as default HZ is 1000) and on platforms with higher SMI handling latencies. During the boot time, SMIs are mostly used by BIOS (for things like legacy keyboard emulation). Description: The psuedocode for current delay calibration with tsc based delay looks like (0) Estimate a value for loops_per_jiffy (1) While (loops_per_jiffy estimate is accurate enough) (2) wait for jiffy transition (jiffy1) (3) Note down current tsc (tsc1) (4) loop until tsc becomes tsc1 + loops_per_jiffy (5) check whether jiffy changed since jiffy1 or not and refine loops_per_jiffy estimate Consider the following cases Case 1: If SMIs happen between (2) and (3) above, we can end up with a loops_per_jiffy value that is too low. This results in shorted delays and kernel can panic () during boot (Mostly at IOAPIC timer initialization timer_irq_works() as we don't have enough timer interrupts in a specified interval). Case 2: If SMIs happen between (3) and (4) above, then we can end up with a loops_per_jiffy value that is too high. And with current i386 code, too high lpj value (greater than 17M) can result in a overflow in delay.c:__const_udelay() again resulting in shorter delay and panic(). Solution: The patch below makes the calibration routine aware of asynchronous events like SMIs. We increase the delay calibration time and also identify any significant errors (greater than 12.5%) in the calibration and notify it to user. Patch below changes both i386 and x86-64 architectures to use this new and improved calibrate_delay_direct() routine. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:08 -07:00
Ian Campbell	0f8e2d62fa	[PATCH] use ${CROSS_COMPILE}installkernel in arch/*/boot/install.sh The attached patch causes the various arch specific install.sh scripts to look for ${CROSS_COMPILE}installkernel rather than just installkernel (in both /sbin/ and ~/bin/ where the script already did this). This allows you to have e.g. arm-linux-installkernel as a handy way to install on your cross target. It also prevents the script picking up on the host /sbin/installkernel which causes the script to fall through and do the install itself (which is what I actually use myself, with $INSTALL_PATH set). I don't believe it causes back-compatibility problems since calling the host installkernel was never likely to work or be what you wanted when cross compiling anyway. If $CROSS_COMPILE isn't set then nothing changes. I only use ARM and i386 myself but I figured it couldn't hurt to do the whole lot. I've cc'd those who I hope are the arch maintainers for files that I've touched. Signed-off-by: Ian Campbell <icampbell@arcom.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:07 -07:00
H. Peter Anvin	d0e7feb03d	[PATCH] biarch compiler support for i386 This allows the i386 architecture to be built on a system with a biarch compiler that defaults to x86-64, merely by specifying ARCH=i386. As previously discussed, this uses the equivalent logic to the ppc port. Signed-Off-By: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:07 -07:00
Martin J. Bligh	6f4e1e5061	[PATCH] add page_state info to show_mem This helps a lot when debugging out of memory stuff - useful especially to see if all the memory is sucked into slab, etc. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:07 -07:00
Andy Whitcroft	05b79bdcb4	[PATCH] sparsemem memory model for i386 Provide the architecture specific implementation for SPARSEMEM for i386 SMP and NUMA systems. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Martin Bligh <mbligh@aracnet.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:05 -07:00
Andy Whitcroft	d41dee369b	[PATCH] sparsemem memory model Sparsemem abstracts the use of discontiguous mem_maps[]. This kind of mem_map[] is needed by discontiguous memory machines (like in the old CONFIG_DISCONTIGMEM case) as well as memory hotplug systems. Sparsemem replaces DISCONTIGMEM when enabled, and it is hoped that it can eventually become a complete replacement. A significant advantage over DISCONTIGMEM is that it's completely separated from CONFIG_NUMA. When producing this patch, it became apparent in that NUMA and DISCONTIG are often confused. Another advantage is that sparse doesn't require each NUMA node's ranges to be contiguous. It can handle overlapping ranges between nodes with no problems, where DISCONTIGMEM currently throws away that memory. Sparsemem uses an array to provide different pfn_to_page() translations for each SECTION_SIZE area of physical memory. This is what allows the mem_map[] to be chopped up. In order to do quick pfn_to_page() operations, the section number of the page is encoded in page->flags. Part of the sparsemem infrastructure enables sharing of these bits more dynamically (at compile-time) between the page_zone() and sparsemem operations. However, on 32-bit architectures, the number of bits is quite limited, and may require growing the size of the page->flags type in certain conditions. Several things might force this to occur: a decrease in the SECTION_SIZE (if you want to hotplug smaller areas of memory), an increase in the physical address space, or an increase in the number of used page->flags. One thing to note is that, once sparsemem is present, the NUMA node information no longer needs to be stored in the page->flags. It might provide speed increases on certain platforms and will be stored there if there is room. But, if out of room, an alternate (theoretically slower) mechanism is used. This patch introduces CONFIG_FLATMEM. It is used in almost all cases where there used to be an #ifndef DISCONTIG, because SPARSEMEM and DISCONTIGMEM often have to compile out the same areas of code. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Martin Bligh <mbligh@aracnet.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Bob Picco <bob.picco@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:04 -07:00
Andy Whitcroft	af705362ab	[PATCH] generify memory present Allow architectures to indicate that they will be providing hooks to indice installed memory areas, memory_present(). Provide prototypes for the i386 implementation. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Martin Bligh <mbligh@aracnet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:04 -07:00
Andy Whitcroft	b159d43fbf	[PATCH] generify early_pfn_to_nid Provide a default implementation for early_pfn_to_nid returning node 0. Allow architectures to override this with their own implementation out of asm/mmzone.h. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Martin Bligh <mbligh@aracnet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:04 -07:00
Dave Hansen	3f22ab276b	[PATCH] make each arch use mm/Kconfig For all architectures, this just means that you'll see a "Memory Model" choice in your architecture menu. For those that implement DISCONTIGMEM, you may eventually want to make your ARCH_DISCONTIGMEM_ENABLE a "def_bool y" and make your users select DISCONTIGMEM right out of the new choice menu. The only disadvantage might be if you have some specific things that you need in your help option to explain something about DISCONTIGMEM. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:02 -07:00
Dave Hansen	5b505b90b2	[PATCH] sparsemem base: teach discontig about sparse ranges discontig.c has some assumptions that mem_map[]s inside of a node are contiguous. Teach it to make sure that each region that it's bringing online is actually made up of valid ranges of ram. Written-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:01 -07:00
Dave Hansen	6f167ec721	[PATCH] sparsemem base: simple NUMA remap space allocator Introduce a simple allocator for the NUMA remap space. This space is very scarce, used for structures which are best allocated node local. This mechanism is also used on non-NUMA ia64 systems with a vmem_map to keep the pgdat->node_mem_map initialized in a consistent place for all architectures. Issues: o alloc_remap takes a node_id where we might expect a pgdat which was intended to allow us to allocate the pgdat's using this mechanism; which we do not yet do. Could have alloc_remap_node() and alloc_remap_nid() for this purpose. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:01 -07:00
Dave Hansen	c2ebaa425e	[PATCH] sparsemem base: early_pfn_to_nid() (works before sparse is initialized) The following four patches provide the last needed changes before the introduction of sparsemem. For a more complete description of what this will do, please see this patch: http://www.sr71.net/patches/2.6.11/2.6.11-bk7-mhp1/broken-out/B-sparse-150-sparsemem.patch or previous posts on the subject: http://marc.theaimsgroup.com/?t=110868540700001&r=1&w=2 http://marc.theaimsgroup.com/?l=linux-mm&m=109897373315016&w=2 Three of these are i386-only, but one of them reorganizes the macros used to manage the space in page->flags, and will affect all platforms. There are analogous patches to the i386 ones for ppc64, ia64, and x86_64, but those will be submitted by the normal arch maintainers. The combination of the four patches has been test-booted on a variety of i386 hardware, and compiled for ppc64, i386, and x86-64 with about 17 different .configs. It's also been runtime-tested on ia64 configs (with more patches on top). This patch: We _know_ which node pages in general belong to, at least at a very gross level in node_{start,end}_pfn[]. Use those to target the allocations of pages. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:00 -07:00
Dave Hansen	408fde81c1	[PATCH] remove non-DISCONTIG use of pgdat->node_mem_map This patch effectively eliminates direct use of pgdat->node_mem_map outside of the DISCONTIG code. On a flat memory system, these fields aren't currently used, neither are they on a sparsemem system. There was also a node_mem_map(nid) macro on many architectures. Its use along with the use of ->node_mem_map itself was not consistent. It has been removed in favor of two new, more explicit, arch-independent macros: pgdat_page_nr(pgdat, pagenr) nid_page_nr(nid, pagenr) I called them "pgdat" and "nid" because we overload the term "node" to mean "NUMA node", "DISCONTIG node" or "pg_data_t" in very confusing ways. I believe the newer names are much clearer. These macros can be overridden in the sparsemem case with a theoretically slower operation using node_start_pfn and pfn_to_page(), instead. We could make this the only behavior if people want, but I don't want to change too much at once. One thing at a time. This patch removes more code than it adds. Compile tested on alpha, alpha discontig, arm, arm-discontig, i386, i386 generic, NUMAQ, Summit, ppc64, ppc64 discontig, and x86_64. Full list here: http://sr71.net/patches/2.6.12/2.6.12-rc1-mhp2/configs/ Boot tested on NUMAQ, x86 SMP and ppc64 power4/5 LPARs. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Martin J. Bligh <mbligh@aracnet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-23 09:45:00 -07:00
Coywolf Qi Hunt	2894801db1	[PATCH] kbuild: display compile version I am always trying to make sure I've booted the right kernel after a new install. Too paranoid maybe. But I guess there're other people like me. So let's make kbuild display the compile version number at the end to give us a hint. I know we may be booting vmlinux someday, but don't care about it for now. Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 18:46:22 -07:00
Badari Pulavarty	cbe37d0937	[PATCH] mm: remove PG_highmem Remove PG_highmem, to save a page flag. Use is_highmem() instead. It'll generate a little more code, but we don't use PageHigheMem() in many places. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 18:46:17 -07:00
Wolfgang Wander	1363c3cd86	[PATCH] Avoiding mmap fragmentation Ingo recently introduced a great speedup for allocating new mmaps using the free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and causes huge performance increases in thread creation. The downside of this patch is that it does lead to fragmentation in the mmap-ed areas (visible via /proc/self/maps), such that some applications that work fine under 2.4 kernels quickly run out of memory on any 2.6 kernel. The problem is twofold: 1) the free_area_cache is used to continue a search for memory where the last search ended. Before the change new areas were always searched from the base address on. So now new small areas are cluttering holes of all sizes throughout the whole mmap-able region whereas before small holes tended to close holes near the base leaving holes far from the base large and available for larger requests. 2) the free_area_cache also is set to the location of the last munmap-ed area so in scenarios where we allocate e.g. five regions of 1K each, then free regions 4 2 3 in this order the next request for 1K will be placed in the position of the old region 3, whereas before we appended it to the still active region 1, placing it at the location of the old region 2. Before we had 1 free region of 2K, now we only get two free regions of 1K -> fragmentation. The patch addresses thes issues by introducing yet another cache descriptor cached_hole_size that contains the largest known hole size below the current free_area_cache. If a new request comes in the size is compared against the cached_hole_size and if the request can be filled with a hole below free_area_cache the search is started from the base instead. The results look promising: Whereas 2.6.12-rc4 fragments quickly and my (earlier posted) leakme.c test program terminates after 50000+ iterations with 96 distinct and fragmented maps in /proc/self/maps it performs nicely (as expected) with thread creation, Ingo's test_str02 with 20000 threads requires 0.7s system time. Taking out Ingo's patch (un-patch available per request) by basically deleting all mentions of free_area_cache from the kernel and starting the search for new memory always at the respective bases we observe: leakme terminates successfully with 11 distinctive hardly fragmented areas in /proc/self/maps but thread creating is gringdingly slow: 30+s(!) system time for Ingo's test_str02 with 20000 threads. Now - drumroll ;-) the appended patch works fine with leakme: it ends with only 7 distinct areas in /proc/self/maps and also thread creation seems sufficiently fast with 0.71s for 20000 threads. Signed-off-by: Wolfgang Wander <wwc@rentec.com> Credit-to: "Richard Purdie" <rpurdie@rpsys.net> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> (partly) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 18:46:16 -07:00
David Gibson	63551ae0fe	[PATCH] Hugepage consolidation A lot of the code in arch/*/mm/hugetlbpage.c is quite similar. This patch attempts to consolidate a lot of the code across the arch's, putting the combined version in mm/hugetlb.c. There are a couple of uglyish hacks in order to covert all the hugepage archs, but the result is a very large reduction in the total amount of code. It also means things like hugepage lazy allocation could be implemented in one place, instead of six. Tested, at least a little, on ppc64, i386 and x86_64. Notes: - this patch changes the meaning of set_huge_pte() to be more analagous to set_pte() - does SH4 need s special huge_ptep_get_and_clear()?? Acked-by: William Lee Irwin <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 18:46:15 -07:00
Martin Hicks	753ee72896	[PATCH] VM: early zone reclaim This is the core of the (much simplified) early reclaim. The goal of this patch is to reclaim some easily-freed pages from a zone before falling back onto another zone. One of the major uses of this is NUMA machines. With the default allocator behavior the allocator would look for memory in another zone, which might be off-node, before trying to reclaim from the current zone. This adds a zone tuneable to enable early zone reclaim. It is selected on a per-zone basis and is turned on/off via syscall. Adding some extra throttling on the reclaim was also required (patch 4/4). Without the machine would grind to a crawl when doing a "make -j" kernel build. Even with this patch the System Time is higher on average, but it seems tolerable. Here are some numbers for kernbench runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run: wall user sys %cpu ctx sw. sleeps ---- ---- --- ---- ------ ------ No patch 1009 1384 847 258 298170 504402 w/patch, no reclaim 880 1376 667 288 254064 396745 w/patch & reclaim 1079 1385 926 252 291625 548873 These numbers are the average of 2 runs of 3 "make -j" runs done right after system boot. Run-to-run variability for "make -j" is huge, so these numbers aren't terribly useful except to seee that with reclaim the benchmark still finishes in a reasonable amount of time. I also looked at the NUMA hit/miss stats for the "make -j" runs and the reclaim doesn't make any difference when the machine is thrashing away. Doing a "make -j8" on a single node that is filled with page cache pages takes 700 seconds with reclaim turned on and 735 seconds without reclaim (due to remote memory accesses). The simple zone_reclaim syscall program is at http://www.bork.org/~mort/sgi/zone_reclaim.c Signed-off-by: Martin Hicks <mort@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 18:46:14 -07:00
Ingo Molnar	39c715b717	[PATCH] smp_processor_id() cleanup This patch implements a number of smp_processor_id() cleanup ideas that Arjan van de Ven and I came up with. The previous __smp_processor_id/_smp_processor_id/smp_processor_id API spaghetti was hard to follow both on the implementational and on the usage side. Some of the complexity arose from picking wrong names, some of the complexity comes from the fact that not all architectures defined __smp_processor_id. In the new code, there are two externally visible symbols: - smp_processor_id(): debug variant. - raw_smp_processor_id(): nondebug variant. Replaces all existing uses of _smp_processor_id() and __smp_processor_id(). Defined by every SMP architecture in include/asm-*/smp.h. There is one new internal symbol, dependent on DEBUG_PREEMPT: - debug_smp_processor_id(): internal debug variant, mapped to smp_processor_id(). Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new lib/smp_processor_id.c file. All related comments got updated and/or clarified. I have build/boot tested the following 8 .config combinations on x86: {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT} I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other architectures are untested, but should work just fine.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-21 18:46:13 -07:00
gregkh@suse.de	8874b414ff	[PATCH] class: convert arch/* to use the new class api instead of class_simple Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2005-06-20 15:15:09 -07:00
Thomas Hood	92c6dc59b7	[PATCH] apm.c: ignore_normal_resume is set a bit too late This patch causes the ignore_normal_resume flag to be set slightly earlier, before there is a chance that the apm driver will receive the normal resume event from the BIOS. (Addresses Debian bug #310865) Signed-off-by: Thomas Hood <jdthood@yahoo.co.uk> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-14 07:19:35 -07:00
Keith Owens	5754c9b649	[PATCH] Stop arch/i386/kernel/vsyscall-note.o being rebuilt every time arch/i386/kernel/vsyscall-note.o is not listed as a target so its .cmd file is neither considered as a target nor is it read on the next build. This causes vsyscall-note.o to be rebuilt every time that you run make, which causes vmlinux to be rebuilt every time. Signed-off-by: Keith Owens <kaos@ocs.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-08 16:21:13 -07:00
Dave Jones	f94ea640a2	[CPUFREQ] Typos. cpfureq developers cant spel. Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:52 -07:00
Dave Jones	6778bae0f2	[CPUFREQ] longhaul - adjust transition latency. From patch by: Ken Staton <ken_staton@agilent.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:51 -07:00
Dave Jones	1174631418	[CPUFREQ] Longhaul: Magic timer frobbing. As mandated by the spec, disable timer around transitions. From code by : Ken Staton <ken_staton@agilent.com Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:51 -07:00
Dave Jones	3be6a48f3c	[CPUFREQ] longhaul - disable PCI mastering around transition. The spec states that we have to do this, which is horrid. Based on code from: Ken Staton <ken_staton@agilent.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:51 -07:00
Dave Jones	065b807ca1	[CPUFREQ] dual-core powernow-k8 With the release of the dual-core AMD Opterons last week, it's high time that cpufreq supported them. The attached patch applies cleanly to 2.6.12-rc3 and updates powernow-k8 to support the latest Athlon 64 and Opteron processors. Update the driver to version 1.40.0 and provide support for dual-core processors. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:46 -07:00
Dave Jones	c5d28fb297	[CPUFREQ] Recalibrate cpu_khz [2/2] Some cpufreq drivers (at that time, only powernow-k7) need to recalibrate the cpu_khz at runtime. Signed-off-by: Bruno Ducrot <ducrot@poupinou.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:46 -07:00
Dave Jones	91350ed49b	[CPUFREQ] Recalibrate cpu_khz [1/2] We have to recalibrate cpu_khz in order to use the current FID instead the max FID since some BIOS do not put the processor at maximum frequency at POST. Also, some BIOS will change the processor frequency at our back after cpu_khz was calibrate. Finally, this will fix a long standing bug when we do something like this: # rmmod powernow-k7 # modprobe powernow-k7 Signed-off-by: Bruno Ducrot <ducrot@poupinou.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:45 -07:00
Dave Jones	bf6fc9fd2d	[CPUFREQ] AMD Elan SC520 cpufreq driver. From: Sean Young <sean@mess.org> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:45 -07:00
Dave Jones	6f4095af6d	[CPUFREQ] speedstep-smi: it works on at least one P4M The speedstep-smi driver actually works on >=1 notebook with a Pentium 4-M CPU where all other cpufreq drivers fail. Therefore, allow speedstep-smi on P4Ms again, but warn users of likely failure Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:44 -07:00
Dave Jones	8282864a96	[CPUFREQ] speedstep-centrino: Pentium 4 - M (HT) support The Pentium 4 - Ms (HT) with CPUID 0xF34 and 0xF41 seem to support centrino-like enhanced speedstep; however, no "table" support is possible. Therefore, put NULL entries into speedstep-centrino.c Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:43 -07:00
Dave Jones	7eb53d8823	[CPUFREQ] powernow-k7: don't print khz element of FSB. Signed-off-by: Dave Jones <davej@redhat.com>	2005-05-31 19:03:42 -07:00
Alexander Nyberg	adaa765d76	[PATCH] acpi build fix: x86 setup.c This is a neverending story linux/acpi.h contains empty declarations for acpi_boot_init() & acpi_boot_table_init() but they are nested inside #ifdef CONFIG_ACPI. So we'll have to #ifdef in arch/i386/kernel/setup.c: setup_arch() Signed-off-by: Alexander Nyberg <alexn@telia.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-31 14:54:17 -07:00
Siddha, Suresh B	49f384b82b	[PATCH] x86: fix smp_num_siblings on buggy BIOSes This fixes 'smp_num_siblings' value on the systems with a buggy bios, which sets number of siblings to '2' even when HT is disabled. (more details are at http://bugzilla.kernel.org/show_bug.cgi?id=4359) I am planning to do more cleanup in this area (like moving smp_num_siblings to per cpuinfo) shortly. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-28 11:14:00 -07:00
Adrian Bunk	70ffc71c5c	[PATCH] arch/i386/kernel/cpu/intel_cacheinfo.c: section fix num_cache_leaves is used in __devexit cache_remove_dev() and can therefore not be __devinit. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-28 11:14:00 -07:00
Alexander Nyberg	8aadff7dd5	[PATCH] Note on ACPI build fix Even after the previous fix you can still set CONFIG_ACPI_BOOT indirectly even without CONFIG_ACPI by choosing CONFIG_PCI and CONFIG_PCI_MMCONFIG. That doesn't build very well either. This makes PCI_MMCONFIG depend on ACPI, fixing that hole. [ I guess in theory Kconfig could follow the whole chain of dependencies for things that get selected, but that sounds insanely complicated, so we'll just fix up these things by hand. --Linus ] Signed-off-by: Alexander Nyberg <alexn@telia.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-27 08:15:05 -07:00
Len Brown	25be5e6ccc	[PATCH] VIA IRQ quirk Delete quirk_via_bridge(), restore quirk_via_irqpic() -- but now improved to be invoked upon device ENABLE, and now only for VIA devices -- not all devices behind VIA bridges. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-27 08:15:04 -07:00
Dominik Hackl	6431e6a28e	[PATCH] voyager_smp.c static inline fix This patch fixes a compile bug by moving a static inline function to the right place. The body of a static inline function has to be declared before the use of this function. Signed-off-by: Dominik Hackl <dominik@hackl.dhs.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-24 20:08:12 -07:00
Andi Kleen	2df9fa3664	[PATCH] x86_64: i386/x86-64: Export cpu_core_map Needed for the powernow k8 driver for dual core support. Signed-off-by: Andi Kleen <ak@suse.de> Cc: <mark.langsdorf@amd.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-20 15:48:21 -07:00
Andi Kleen	4057923614	[PATCH] i386: Fix race in iounmap We need to hold the vmlist_lock while doing change_page_attr, otherwise we could reset someone else's mapping. Requires previous patch to add __remove_vm_area Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-20 15:48:20 -07:00
Andi Kleen	b41e29398a	[PATCH] x86_64: 386/x86-64 Further AMD dual core fixes - Remove duplicated ifdef - Make core_id match what Intel uses - Initialize phys_proc_id correctly for non DC case - Handle non power of two core numbers. Fixes for both i386 and x86-64 Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-20 15:48:20 -07:00
Christoph Lameter	ff0d2f90fd	[PATCH] fix memory scribble in arch/i386/pci/fixup.c The GET_INDEX() macro should use just the low three bits of the devfn, otherwise we have a memory scribble in pcie_rootport_aspm_quirk that overwrites ptype_all Fix it to be more careful about its arguments while at it. Acked by Dely Sy <dely.l.sy@intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-17 09:27:05 -07:00
Andi Kleen	a158608bf4	[PATCH] x86_64/i386: fix defaults for physical/core id in /proc/cpuinfo Last round hopefully of cpu_core_id changes hopefully fow now: - Always initialize cpu_core_id for all CPUs, even when no dual core setup is detected. This prevents funny /proc/cpuinfo output - Do the same with phys_proc_id[] even when no HyperThreading - dito. - Use the CPU APIC-ID from CPUID 1 instead of the linux virtual CPU number to identify the core for AMD dual core setups. Patch for i386/x86-64. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-17 07:59:13 -07:00
Linus Torvalds	22490eb80c	Fix acpi_find_rsdp() - acpi_scan_rsdp takes length, not end Noticed by Jakub Jermar <jermar@itbs.cz>	2005-05-06 15:39:23 -07:00
Kianusch Sayah Karadji	4713741955	[PATCH] x86: geode support fixes - Changed Name/defines from "Geode GX" to "Geode GX1" for clarification - Dropped "-march=i586" in favor of "-march=i486" - Dopped X86_OOSTORE support for Geode GX1 Signed-off-by: Kianusch Sayah Karadji <kianusch@sk-tech.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-05 16:36:49 -07:00
Domen Puncer	125947f2ab	[PATCH] CodingStyle: trivial whitespace fixups When I do a "diff -Nur arch/i386 arch/x86_64" to see what is different between these two architectures, I see some differences due to whitespace issues only. The attached patch removes some of the noise by fixing up the following files: - arch/i386/boot/bootsect.S - arch/i386/boot/video.S - arch/x86_64/boot/bootsect.S Signed-off-by: Daniel Dickman <didickman@yahoo.com> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-05 16:36:49 -07:00
maximilian attems	a27e951f1e	[PATCH] cyrix: eliminate bad section references Fix cyrix section references: convert __initdata to __devinitdata. Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 00000379 R_386_32 .init.data Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 00000399 R_386_32 .init.data Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003b3 R_386_32 .init.data Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003b9 R_386_32 .init.data Error: ./arch/i386/kernel/cpu/mtrr/cyrix.o .text refers to 000003bf R_386_32 .init.data Signed-of-by: maximilian attems <janitor@sternwelten.at> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-05 16:36:47 -07:00
Prasanna S Panchamukhi	0b9e2cac8a	[PATCH] Kprobes: Incorrect handling of probes on ret/lret instruction Kprobes could not handle the insertion of a probe on the ret/lret instruction and used to oops after single stepping since kprobes was modifying eip/rip incorrectly. Adjustment of eip/rip is not required after single stepping in case of ret/lret instruction, because eip/rip points to the correct location after execution of the ret/lret instruction. This patch fixes the above problem. Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-05 16:36:39 -07:00
Paolo 'Blaisorblade' Giarrusso	0c28130b5c	[PATCH] x86_64: make string func definition work as intended In include/asm-x86_64/string.h there are such comments: /* Use C out of line version for memcmp / #define memcmp __builtin_memcmp int memcmp(const void cs,const void * ct,size_t count); This would mean that if the compiler does not decide to use __builtin_memcmp, it emits a call to memcmp to be satisfied by the C out-of-line version in lib/string.c. What happens is that after preprocessing, in lib/string.i you may find the definition of "__builtin_strcmp". Actually, by accident, in the object you will find the definition of strcmp and such (maybe a trick intended to redirect calls to __builtin_memcmp to the default memcmp when the definition is not expanded); however, this particular case is not a documented feature as far as I can see. Also, the EXPORT_SYMBOL does not work, so it's duplicated in the arch. I simply added some #undef to lib/string.c and removed the (now duplicated) exports in x86-64 and UML/x86_64 subarchs (the second ones are introduced by another patch I just posted for -mm). Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-05 16:36:33 -07:00
Alexander Nyberg	f48d9663f1	[PATCH] x86 stack initialisation fix The recent change fix-crash-in-entrys-restore_all.patch childregs->esp = esp; p->thread.esp = (unsigned long) childregs; - p->thread.esp0 = (unsigned long) (childregs+1); + p->thread.esp0 = (unsigned long) (childregs+1) - 8; p->thread.eip = (unsigned long) ret_from_fork; introduces an inconsistency between esp and esp0 before the task is run the first time. esp0 is no longer the actual start of the stack, but 8 bytes off. This shows itself clearly in a scenario when a ptracer that is set to also ptrace eventual children traces program1 which then clones thread1. Now the ptracer wants to modify the registers of thread1. The x86 ptrace implementation bases it's knowledge about saved user-space registers upon p->thread.esp0. But this will be a few bytes off causing certain writes to the kernel stack to overwrite a saved kernel function address making the kernel when actually running thread1 jump out into user-space. Very spectacular. The testcase I've used is: /* start with strace -f ./a.out / #include <pthread.h> #include <stdio.h> void do_thread(void *p) { for (;;); } int main() { pthread_t one; pthread_create(&one, NULL, &do_thread, NULL); for (;;); return 0; } So, my solution is to instead of just adjusting esp0 that creates an inconsitent state I adjust where the user-space registers are saved with -8 bytes. This gives us the wanted extra bytes on the start of the stack and esp0 is now correct. This solves the issues I saw from the original testcase from Mateusz Berezecki and has survived testing here. I think this should go into -mm a round or two first however as there might be some cruft around depending on pt_regs lying on the start of the stack. That however would have broken with the first change too! It's actually a 2-line diff but I had to move the comment of why the -8 bytes are there a few lines up. Thanks to Zwane for helping me with this. Signed-off-by: Alexander Nyberg <alexn@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-05 16:36:30 -07:00
David Woodhouse	bfd4bda097	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	2005-05-05 13:59:37 +01:00
Al Viro	5cae841b13	[PATCH] ISA DMA Kconfig fixes - part 1 A bunch of drivers use ISA DMA helpers or their equivalents for platforms that have ISA with different DMA controller (a lot of ARM boxen). Currently there is no way to put such dependency in Kconfig - CONFIG_ISA is not it (e.g. it is not set on platforms that have no ISA slots, but have on-board devices that pretend to be ISA ones). New symbol added - ISA_DMA_API. Set when we have functional enable_dma()/set_dma_mode()/etc. set of helpers. Next patches in the series will add missing dependencies for drivers that need them. I'm very carefully staying the hell out of the recurring flamefest on what exactly CONFIG_ISA would mean in ideal world - added symbol has a well-defined meaning and for now I really want to treat it as completely independent from the mess around CONFIG_ISA. Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-04 07:33:13 -07:00
David Woodhouse	27b030d58c	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	2005-05-03 08:14:09 +01:00
Adrian Bunk	408b664a7d	[PATCH] make lots of things static Another large rollup of various patches from Adrian which make things static where they were needlessly exported. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:29 -07:00
Adrian Bunk	5f76be80d9	[PATCH] fbdev: edid.h cleanups This patch removes some completely unused code. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Antonino Daplas <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:23 -07:00
Jesper Juhl	7ed20e1ad5	[PATCH] convert that currently tests _NSIG directly to use valid_signal() Convert most of the current code that uses _NSIG directly to instead use valid_signal(). This avoids gcc -W warnings and off-by-one errors. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:14 -07:00
Jesper Juhl	e49332bd12	[PATCH] misc verify_area cleanups There were still a few comments left refering to verify_area, and two functions, verify_area_skas & verify_area_tt that just wrap corresponding access_ok_skas & access_ok_tt functions, just like verify_area does for access_ok - deprecate those. There was also a few places that still used verify_area in commented-out code, fix those up to use access_ok. After applying this one there should not be anything left but finally removing verify_area completely, which will happen after a kernel release or two. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:08 -07:00
Paul E. McKenney	fbd568a3e6	[PATCH] Change synchronize_kernel to _rcu and _sched This patch changes calls to synchronize_kernel(), deprecated in the earlier "Deprecate synchronize_kernel, GPL replacement" patch to instead call the new synchronize_rcu() and synchronize_sched() APIs. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:04 -07:00
Matt Mackall	d59745ce3e	[PATCH] clean up kernel messages Arrange for all kernel printks to be no-ops. Only available if CONFIG_EMBEDDED. This patch saves about 375k on my laptop config and nearly 100k on minimal configs. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:02 -07:00
Paolo 'Blaisorblade' Giarrusso	5e7b83ffc6	[PATCH] uml: fix syscall table by including $(SUBARCH)'s one, for i386 Split the i386 entry.S files into entry.S and syscall_table.S which is included in the previous one (so actually there is no difference between them) and use the syscall_table.S in the UML build, instead of tracking by hand the syscall table changes (which is inherently error-prone). We must only insert the right #defines to inject the changes we need from the i386 syscall table (for instance some different function names); also, we don't implement some i386 syscalls, as ioperm(), nor some TLS-related ones (yet to provide). Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:55 -07:00
Pavel Pisa	ad6714230f	[PATCH] Linux 2.6.x VM86 interrupt emulation fixes Patch solves VM86 interrupt emulation deadlock on SMP systems. The VM86 interrupt emulation has been heavily tested and works well on UP systems after last update, but it seems to deadlock when we have used it on SMP/HT boxes now. It seems, that disable_irq() cannot be called from interrupts, because it waits until disabled interrupt handler finishes (/kernel/irq/manage.c:synchronize_irq():while(IRQ_INPROGRESS);). This blocks one CPU after another. Solved by use disable_irq_nosync. There is the second problem. If IRQ source is fast, it is possible, that interrupt is sometimes processed and re-enabled by the second CPU, before it is disabled by the first one, but negative IRQ disable depths are not allowed. The spinlocking and disabling IRQs over call to disable_irq_nosync/enable_irq is the only solution found reliable till now. Signed-off-by: Michal Sojka <sojkam1@control.felk.cvut.cz> Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:52 -07:00
Venkatesh Pallipadi	f9ba70535d	[PATCH] Increase number of e820 entries hard limit from 32 to 128 The specifications that talk about E820 map doesn't have an upper limit on the number of e820 entries. But, today's kernel has a hard limit of 32. With increase in memory size, we are seeing the number of E820 entries reaching close to 32. Patch below bumps the number upto 128. The patch changes the location of EDDBUF in zero-page (as it comes after E820). As, EDDBUF is not used by boot loaders, this patch should not have any effect on bootloader-setup code interface. Patch covers both i386 and x86-64. Tested on: * grub booting bzImage * lilo booting bzImage with EDID info enabled * pxeboot of bzImage Side-effect: bss increases by ~ 2K and init.data increases by ~7.5K on all systems, due to increase in size of static arrays. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:51 -07:00
Zwane Mwaikambo	3c3b73b6f5	[PATCH] cpuid x87 bit on AMD falsely marked as PNI http://bugme.osdl.org/show_bug.cgi?id=4426 vendor_id : AuthenticAMD cpu family : 6 model : 10 model name : AMD Athlon(tm) XP stepping : 0 cpu MHz : 2204.807 <snipped> cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow bogomips : 4358.14 We're marking bit 0 of extended function 0x80000001 cpuid as PNI support on AMD processors, when it actually denotes x87 FPU present. Patch for i386 and x86_64 below. Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:51 -07:00
Jason Gaston	4d24a439a6	[PATCH] irq and pci_ids for Intel ICH7DH & ICH7-M DH This patch adds the Intel ICH7DH and ICH7-M DH DID's to the irq.c and pci_ids.h files. Signed-off-by: �Jason Gaston <Jason.d.gaston@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:50 -07:00
john stultz	35492df5ae	[PATCH] i386: fix hpet for systems that don't support legacy replacement Currently the i386 HPET code assumes the entire HPET implementation from the spec is present. This breaks on boxes that do not implement the optional legacy timer replacement functionality portion of the spec. This patch, which is very similar to my x86-64 patch for the same issue, fixes the problem allowing i386 systems that cannot use the HPET for the timer interrupt and RTC to still use the HPET as a time source. I've tested this patch on a system systems without HPET, with HPET but without legacy timer replacement, as well as HPET with legacy timer replacement. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:50 -07:00
Lee Revell	a6954ba2e8	[PATCH] Enable write combining for server works LE rev > 6 Enable write combining for server works LE rev > 6 per http://www.ussg.iu.edu/hypermail/linux/kernel/0104.3/1007.html Signed-Off-By: Lee Revell <rlrevell@joe-job.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:49 -07:00
Stas Sergeev	48c88211a6	[PATCH] x86: entry.S trap return fixes do_debug() and do_int3() return void. This patch fixes the CONFIG_KPROBES variant of do_int3() to return void too and adjusts entry.S accordingly. Signed-off-by: Stas Sergeev <stsp@aknet.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:49 -07:00
Jaya Kumar	a2f7c35415	[PATCH] x86 reboot: Add reboot fixup for gx1/cs5530a This patch by Jaya Kumar introduces a generic infrastructure to deal with x86 chipsets with nonstandard reset sequences, and adds support for the Geode gx1/cs5530a chipset. Signed-off-by: Jaya Kumar <jayalk@intworks.biz> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:49 -07:00
Jack F Vogel	67701ae976	[PATCH] check nmi watchdog is broken A bug against an xSeries system showed up recently noting that the check_nmi_watchdog() test was failing. I have been investigating it and discovered in both i386 and x86_64 the recent change to the routine to use the cpu_callin_map has uncovered a problem. Prior to that change, on an SMP box, the test was trivally passing because all cpu's were found to not yet be online, but now with the callin_map they are discovered, it goes on to test the counter and they have not yet begun to increment, so it announces a CPU is stuck and bails out. On all the systems I have access to test, the announcement of failure is also bougs... by the time you can login and check /proc/interrupts, the NMI count is happily incrementing on all CPUs. Its just that the test is being done too early. I have tried moving the call to the test around a bit, and it was always too early. I finally hit on this proposed solution, it delays the routine via a late_initcall(), seems like the right solution to me. Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:48 -07:00
H. J. Lu	fd51f666fa	[PATCH] i386/x86_64 segment register access update The new i386/x86_64 assemblers no longer accept instructions for moving between a segment register and a 32bit memory location, i.e., movl (%eax),%ds movl %ds,(%eax) To generate instructions for moving between a segment register and a 16bit memory location without the 16bit operand size prefix, 0x66, mov (%eax),%ds mov %ds,(%eax) should be used. It will work with both new and old assemblers. The assembler starting from 2.16.90.0.1 will also support movw (%eax),%ds movw %ds,(%eax) without the 0x66 prefix. I am enclosing patches for 2.4 and 2.6 kernels here. The resulting kernel binaries should be unchanged as before, with old and new assemblers, if gcc never generates memory access for unsigned gsindex; asm volatile("movl %%gs,%0" : "=g" (gsindex)); If gcc does generate memory access for the code above, the upper bits in gsindex are undefined and the new assembler doesn't allow it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:58:48 -07:00
Sam Ravnborg	2cacb3da62	[PATCH] kbuild/i386: re-introduce dependency on vmlinux for install target, and add kernel_install Removing the dependency on vmlinux for the install target raised a few complaints, so instead a new target i added: kernel_install. kernel_install will install the kernel just like the ordinary install target. The only difference is that install has a dependency on vmlinux, kernel_install does not. Therefore kernel_install is the best choice when accessing the kernel over a NFS mount or as another user. kernel_install is similar to modules_install in the fact that neither does a full kernel compile before performing the install. In this way they are good for root use. Also added back the dependency on vmlinux for the install target so peoples scripts are no longer broken. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-30 16:51:42 -07:00
Linus Torvalds	a879cbbb34	x86: make traps on 'iret' be debuggable in user space This makes a trap on the 'iret' that returns us to user space cause a nice clean SIGSEGV, instead of just a hard (and silent) exit. That way a debugger can actually try to see what happened, and we also properly notify everybody who might be interested about us being gone. This loses the error code, but tells the debugger what happened with ILL_BADSTK in the siginfo.	2005-04-29 09:38:44 -07:00
	2fd6f58ba6	[AUDIT] Don't allow ptrace to fool auditing, log arch of audited syscalls. We were calling ptrace_notify() after auditing the syscall and arguments, but the debugger could have _changed_ them before the syscall was actually invoked. Reorder the calls to fix that. While we're touching ever call to audit_syscall_entry(), we also make it take an extra argument: the architecture of the syscall which was made, because some architectures allow more than one type of syscall. Also add an explicit success/failure flag to audit_syscall_exit(), for the benefit of architectures which return that in a condition register rather than only returning a single register. Change type of syscall return value to 'long' not 'int'. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2005-04-29 16:08:28 +01:00
James Bottomley	1cff94c6fe	[PATCH] fix subarch breakage in amd dual core updates The patch to arch/i386/kernel/cpu/amd.c relies on the variable cpu_core_id which is defined in i386/kernel/smpboot.c. This means it is only present if CONFIG_X86_SMP is defined, not CONFIG_SMP (alternative SMP harnesses won't have it, which is why it breaks voyager). Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-21 16:20:35 -07:00
Hugh Dickins	021740dc30	[PATCH] freepgt: hugetlb area is clean Once we're strict about clearing away page tables, hugetlb_prefault can assume there are no page tables left within its range. Since the other arches continue if !pte_none here, let i386 do the same. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-19 13:29:18 -07:00
Hugh Dickins	e0da382c92	[PATCH] freepgt: free_pgtables use vma list Recent woes with some arches needing their own pgd_addr_end macro; and 4-level clear_page_range regression since 2.6.10's clear_page_tables; and its long-standing well-known inefficiency in searching throughout the higher-level page tables for those few entries to clear and free: all can be blamed on ignoring the list of vmas when we free page tables. Replace exit_mmap's clear_page_range of the total user address space by free_pgtables operating on the mm's vma list; unmap_region use it in the same way, giving floor and ceiling beyond which it may not free tables. This brings lmbench fork/exec/sh numbers back to 2.6.10 (unless preempt is enabled, in which case latency fixes spoil unmap_vmas throughput). Beware: the do_mmap_pgoff driver failure case must now use unmap_region instead of zap_page_range, since a page table might have been allocated, and can only be freed while it is touched by some vma. Move free_pgtables from mmap.c to memory.c, where its lower levels are adapted from the clear_page_range levels. (Most of free_pgtables' old code was actually for a non-existent case, prev not properly set up, dating from before hch gave us split_vma.) Pass mmu_gather** in the public interfaces, since we might want to add latency lockdrops later; but no attempt to do so yet, going by vma should itself reduce latency. But what if is_hugepage_only_range? Those ia64 and ppc64 cases need careful examination: put that off until a later patch of the series. What of x86_64's 32bit vdso page __map_syscall32 maps outside any vma? And the range to sparc64's flush_tlb_pgtables? It's less clear to me now that we need to do more than is done here - every PMD_SIZE ever occupied will be flushed, do we really have to flush every PGDIR_SIZE ever partially occupied? A shame to complicate it unnecessarily. Special thanks to David Miller for time spent repairing my ceilings. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-19 13:29:15 -07:00
Chris Wedgwood	15e8869943	[PATCH] x86: fix acpi compile without CONFIG_ACPI_BUS The recent acpi boot patch breaks for me: acpi_fadt needs CONFIG_ACPI_BUS. Signed-off-By: Chris Wedgwood <cw@f00f.org> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-18 08:01:30 -07:00
Coywolf Qi Hunt	6c46ada700	[PATCH] reparent_to_init cleanup This patch hides reparent_to_init(). reparent_to_init() should only be called by daemonize(). Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:01 -07:00
maximilian attems	c41f5eb3b8	[PATCH] efi: eliminate bad section references Randy please double check especially this one. there may be a better solution. Fix efi section references: remove __initdata for struct efi efi_phys and struct efi_memory_map memmap Error: ./arch/i386/kernel/efi.o .text refers to 000000d3 R_386_32 .init.data Error: ./arch/i386/kernel/efi.o .text refers to 000000ff R_386_32 .init.data efi_memmap_walk (which is not __init nor static) accesses both efi_phys and memmap. Signed-off-by: maximilian attems <janitor@sternwelten.at> Acked-by: Randy Dunlap <rddunlap@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:53 -07:00
Pavel Machek	438510f6f0	[PATCH] pm_message_t: more fixes in common and i386 I thought I'm done with fixing u32 vs. pm_message_t ... unfortunately that turned out not to be the case as Russel King pointed out. Here are fixes for Documentation and common code (mainly system devices). Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:24 -07:00
Siddha, Suresh B	d31ddaa172	[PATCH] x86, x86_64: dual core proc-cpuinfo and sibling-map fix - broken sibling_map setup in x86_64 - grouping all the core and HT related cpuinfo fields. We are reasonably sure that adding new cpuinfo fields after "siblings" field, will not cause any app failure. Thats because today's /proc/cpuinfo format is completely different on x86, x86_64 and we haven't heard of any x86 app breakage because of this issue. Grouping these fields will result in more or less common format on all architectures (ia64, x86 and x86_64) and will cause less confusion. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:20 -07:00
Andi Kleen	635186447d	[PATCH] x86_64: Final support for AMD dual core Clean up the code greatly. Now uses the infrastructure from the Intel dual core patch Should fix a final bug noticed by Tyan of not detecting the nodes correctly in some corner cases. Patch for x86-64 and i386 Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:16 -07:00
Andi Kleen	3dd9d51484	[PATCH] x86_64: add support for Intel dual-core detection and displaying Appended patch adds the support for Intel dual-core detection and displaying the core related information in /proc/cpuinfo. It adds two new fields "core id" and "cpu cores" to x86 /proc/cpuinfo and the "core id" field for x86_64("cpu cores" field is already present in x86_64). Number of processor cores in a die is detected using cpuid(4) and this is documented in IA-32 Intel Architecture Software Developer's Manual (vol 2a) (http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a) This patch also adds cpu_core_map similar to cpu_sibling_map. Slightly hacked by AK. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:15 -07:00
Siddha, Suresh B	cf94b62f70	[PATCH] x86_64-always-use-cpuid-80000008-to-figure-out-mtrr fix We need to use the size_and_mask in set_mtrr_var_ranges(which is called while programming MTRR's for AP's Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:11 -07:00
Andi Kleen	1f2c958ad5	[PATCH] x86_64: Always use CPUID 80000008 to figure out MTRR address space size It doesn't make sense to only do this only for AMD K8. This would support future CPUs with extended address spaces properly. For i386 and x86-64 Cc: <davej@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:25:10 -07:00
Jason Davis	90660ec3c3	[PATCH] x86_64 genapic update x86_64 genapic mechanism should be aware of machines that use physical APIC mode regardless of how many clusters/processors are detected. ACPI 3.0 FADT makes this determination very simple by providing a feature flag "force_apic_physical_destination_mode" to state whether the machine unconditionally uses physical APIC mode. Unisys' next generation x86_64 ES7000 will need to utilize this FADT feature flag in order to boot the x86_64 kernel in the correct APIC mode. This patch has been tested on both x86_64 commodity and ES7000 boxes. Signed-off-by: Jason Davis <jason.davis@unisys.com> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:24:53 -07:00
Andi Kleen	db46868128	[PATCH] x86-64/i386: Revert cpuinfo siblings behaviour back to 2.6.10 Only display physical id/siblings when there are siblings or dual core. In 2.6.11 I accidentially broke it and it was always displaying these fields But for compatibility to all these /proc parsers around it is better to do it in the old way again. Noticed by Suresh Siddha Cc: <Suresh.b.siddha@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:24:51 -07:00
Roland McGrath	c97db4a0a7	[PATCH] i386 vDSO: add PT_NOTE segment This patch adds an ELF note to the vDSO giving the LINUX_VERSION_CODE value. Having this in the vDSO lets the dynamic linker avoid the `uname' syscall it now always does at startup to ascertain the kernel ABI available. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:24:48 -07:00
Roland McGrath	ecd02dddd1	[PATCH] i386: Use loaddebug macro consistently This moves the macro loaddebug from asm-i386/suspend.h to asm-i386/processor.h, which is the place that makes sense for it to be defined, removes the extra copy of the same macro in arch/i386/kernel/process.c, and makes arch/i386/kernel/signal.c use the macro in place of its expansion. This is a purely cosmetic cleanup for the normal i386 kernel. However, it is handy for Xen to be able to just redefine the loaddebug macro once instead of also changing the signal.c code. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:24:46 -07:00
Jason Gaston	e285f8091b	[PATCH] irq and pci_ids: patch for Intel ESB2 This patch adds the Intel ESB2 DID's to the irq.c and pci_ids.h files. Signed-off-by: Jason Gaston <Jason.d.gaston@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:24:41 -07:00
Stas Sergeev	5df240826c	[PATCH] fix crash in entry.S restore_all Fix the access-above-bottom-of-stack crash. 1. Allows to preserve the valueable optimization 2. Works for NMIs 3. Doesn't care whether or not there are more of the like instances where the stack is left empty. 4. Seems to work for me without the crashes:) (akpm: this is still under discussion, although I _think_ it's OK. You might want to hold off) Signed-off-by: Stas Sergeev <stsp@aknet.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:24:01 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

... 3 4 5 6 7

339 Commits