linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-25 05:34:00 +08:00

Author	SHA1	Message	Date
Thomas Gleixner	771ee3b04e	[PATCH] Add a function to handle interrupt affinity setting Provide funtions to: - check, whether an interrupt can set the affinity - pin the interrupt to a given cpu Necessary for the ability to setup clocksources more flexible (e.g. use the different HPET channels per CPU) [akpm@osdl.org: alpha build fix] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-16 08:13:56 -08:00
Thomas Gleixner	950f4427c2	[PATCH] Add irq flag to disable balancing for an interrupt Add a flag so we can prevent the irq balancing of an interrupt. Move the bits, so we have room for more :) Necessary for the ability to setup clocksources more flexible (e.g. use the different HPET channels per CPU) Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-16 08:13:56 -08:00
Linus Torvalds	414f827c46	Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6 * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (94 commits) [PATCH] x86-64: Remove mk_pte_phys() [PATCH] i386: Fix broken CONFIG_COMPAT_VDSO on i386 [PATCH] i386: fix 32-bit ioctls on x64_32 [PATCH] x86: Unify pcspeaker platform device code between i386/x86-64 [PATCH] i386: Remove extern declaration from mm/discontig.c, put in header. [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c [PATCH] i386: Move mce_disabled to asm/mce.h [PATCH] i386: paravirt unhandled fallthrough [PATCH] x86_64: Wire up compat epoll_pwait [PATCH] x86: Don't require the vDSO for handling a.out signals [PATCH] i386: Fix Cyrix MediaGX detection [PATCH] i386: Fix warning in cpu initialization [PATCH] i386: Fix warning in microcode.c [PATCH] x86: Enable NMI watchdog for AMD Family 0x10 CPUs [PATCH] x86: Add new CPUID bits for AMD Family 10 CPUs in /proc/cpuinfo [PATCH] i386: Remove fastcall in paravirt.[ch] [PATCH] x86-64: Fix wrong gcc check in bitops.h [PATCH] x86-64: survive having no irq mapping for a vector [PATCH] i386: geode configuration fixes [PATCH] i386: add option to show more code in oops reports ...	2007-02-14 09:46:06 -08:00
Eric W. Biederman	d912b0cc1a	[PATCH] sysctl: add a parent entry to ctl_table and set the parent entry Add a parent entry into the ctl_table so you can walk the list of parents and find the entire path to a ctl_table entry. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:10:00 -08:00
Eric W. Biederman	77b14db502	[PATCH] sysctl: reimplement the sysctl proc support With this change the sysctl inodes can be cached and nothing needs to be done when removing a sysctl table. For a cost of 2K code we will save about 4K of static tables (when we remove de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or about half that on a 32bit arch. The speed feels about the same, even though we can now cache the sysctl dentries :( We get the core advantage that we don't need to have a 1 to 1 mapping between ctl table entries and proc files. Making it possible to have /proc/sys vary depending on the namespace you are in. The currently merged namespaces don't have an issue here but the network namespace under /proc/sys/net needs to have different directories depending on which network adapters are visible. By simply being a cache different directories being visible depending on who you are is trivial to implement. [akpm@osdl.org: fix uninitialised var] [akpm@osdl.org: fix ARM build] [bunk@stusta.de: make things static] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:10:00 -08:00
Eric W. Biederman	1ff007eb8e	[PATCH] sysctl: allow sysctl_perm to be called from outside of sysctl.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	805b5d5e06	[PATCH] sysctl: factor out sysctl_head_next from do_sysctl The current logic to walk through the list of sysctl table headers is slightly painful and implement in a way it cannot be used by code outside sysctl.c I am in the process of implementing a version of the sysctl proc support that instead of using the proc generic non-caching monster, just uses the existing sysctl data structure as backing store for building the dcache entries and for doing directory reads. To use the existing data structures however I need a way to get at them. [akpm@osdl.org: warning fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	0b4d414714	[PATCH] sysctl: remove insert_at_head from register_sysctl The semantic effect of insert_at_head is that it would allow new registered sysctl entries to override existing sysctl entries of the same name. Which is pain for caching and the proc interface never implemented. I have done an audit and discovered that none of the current users of register_sysctl care as (excpet for directories) they do not register duplicate sysctl entries. So this patch simply removes the support for overriding existing entries in the sys_sysctl interface since no one uses it or cares and it makes future enhancments harder. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Neil Brown <neilb@suse.de> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Jan Kara <jack@ucw.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	ae83681026	[PATCH] sysctl: remove support for directory strategy routines parse_table has support for calling a strategy routine when descending into a directory. To date no one has used this functionality and the /proc/sys interface has no analog to it. So no one is using this functionality kill it and make the binary sysctl code easier to follow. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	6703ddfcce	[PATCH] sysctl: remove support for CTL_ANY There are currently no users in the kernel for CTL_ANY and it only has effect on the binary interface which is practically unused. So this complicates sysctl lookups for no good reason so just remove it. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	2abc26fc6b	[PATCH] sysctl: create sys/fs/binfmt_misc as an ordinary sysctl entry binfmt_misc has a mount point in the middle of the sysctl and that mount point is created as a proc_generic directory. Doing it that way gets in the way of cleaning up the sysctl proc support as it continues the existence of a horrible hack. So instead simply create the directory as an ordinary sysctl directory. At least that removes the magic special case. [akpm@osdl.org: warning fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	a5494dcd8b	[PATCH] sysctl: move SYSV IPC sysctls to their own file This is just a simple cleanup to keep kernel/sysctl.c from getting to crowded with special cases, and by keeping all of the ipc logic to together it makes the code a little more readable. [gcoady.lk@gmail.com: build fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Grant Coady <gcoady.lk@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	39732acd96	[PATCH] sysctl: move utsname sysctls to their own file This is just a simple cleanup to keep kernel/sysctl.c from getting to crowded with special cases, and by keeping all of the utsname logic to together it makes the code a little more readable. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:58 -08:00
Eric W. Biederman	b04c3afb2b	[PATCH] sysctl: move init_irq_proc into init/main where it belongs Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:58 -08:00
Thomas Gleixner	38515e908b	[PATCH] Scheduled removal of SA_xxx interrupt flags fixups The obsolete SA_xxx interrupt flags have been used despite the scheduled removal. Fixup the remaining users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: Roland Dreier <rolandd@cisco.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: Dave Airlie <airlied@linux.ie> Cc: James Simmons <jsimmons@infradead.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:54 -08:00
Tim Schmielau	cd354f1ae7	[PATCH] remove many unneeded #includes of sched.h After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:54 -08:00
Andi Kleen	a98f0dd34d	[PATCH] x86-64: Allow to run a program when a machine check event is detected When a machine check event is detected (including a AMD RevF threshold overflow event) allow to run a "trigger" program. This allows user space to react to such events sooner. The trigger is configured using a new trigger entry in the machinecheck sysfs interface. It is currently shared between all CPUs. I also fixed the AMD threshold handler to run the machine check polling code immediately to actually log any events that might have caused the threshold interrupt. Also added some documentation for the mce sysfs interface. Signed-off-by: Andi Kleen <ak@suse.de>	2007-02-13 13:26:23 +01:00
Zachary Amsden	9226d125d9	[PATCH] i386: paravirt CPU hypercall batching mode The VMI ROM has a mode where hypercalls can be queued and batched. This turns out to be a significant win during context switch, but must be done at a specific point before side effects to CPU state are visible to subsequent instructions. This is similar to the MMU batching hooks already provided. The same hooks could be used by the Xen backend to implement a context switch multicall. To explain a bit more about lazy modes in the paravirt patches, basically, the idea is that only one of lazy CPU or MMU mode can be active at any given time. Lazy MMU mode is similar to this lazy CPU mode, and allows for batching of multiple PTE updates (say, inside a remap loop), but to avoid keeping some kind of state machine about when to flush cpu or mmu updates, we just allow one or the other to be active. Although there is no real reason a more comprehensive scheme could not be implemented, there is also no demonstrated need for this extra complexity. Signed-off-by: Zachary Amsden <zach@vmware.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org>	2007-02-13 13:26:21 +01:00
Eric Dumazet	5809f9d442	[PATCH] x86-64: get rid of ARCH_HAVE_XTIME_LOCK ARCH_HAVE_XTIME_LOCK is used by x86_64 arch . This arch needs to place a read only copy of xtime_lock into vsyscall page. This read only copy is named __xtime_lock, and xtime_lock is defined in arch/x86_64/kernel/vmlinux.lds.S as an alias. So the declaration of xtime_lock in kernel/timer.c was guarded by ARCH_HAVE_XTIME_LOCK define, defined to true on x86_64. We can get same result with _attribute__((weak)) in the declaration. linker should do the job. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org>	2007-02-13 13:26:21 +01:00
Arjan van de Ven	92e1d5be91	[PATCH] mark struct inode_operations const 2 Many struct inode_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00
Arjan van de Ven	9a32144e9d	[PATCH] mark struct file_operations const 7 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:46 -08:00
Nick Piggin	ff91691bcc	[PATCH] sched: avoid div in rebalance_tick Avoid expensive integer divide 3 times per CPU per tick. A userspace test of this loop went from 26ns, down to 19ns on a G5; and from 123ns down to 28ns on a P3. (Also avoid a variable bit shift, as suggested by Alan. The effect of this wasn't noticable on the CPUs I tested with). Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:37 -08:00
Eric W. Biederman	27b0b2f44a	[PATCH] pid: remove the now unused kill_pg kill_pg_info and __kill_pg_info Now that I have changed all of the in-tree users remove the old version of these functions. This should make it clear to any out of tree users that they should be using kill_pgrp kill_pgrp_info or __kill_pgrp_info instead. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:32 -08:00
Eric W. Biederman	41487c65bf	[PATCH] pid: replace do/while_each_task_pid with do/while_each_pid_task There isn't any real advantage to this change except that it allows the old functions to be removed. Which is easier on maintenance and puts the code in a more uniform style. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:32 -08:00
Eric W. Biederman	ab521dc0f8	[PATCH] tty: update the tty layer to work with struct pid Of kernel subsystems that work with pids the tty layer is probably the largest consumer. But it has the nice virtue that the assiation with a session only lasts until the session leader exits. Which means that no reference counting is required. So using struct pid winds up being a simple optimization to avoid hash table lookups. In the long term the use of pid_nr also ensures that when we have multiple pid spaces mixed everything will work correctly. Signed-off-by: Eric W. Biederman <eric@maxwell.lnxi.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:32 -08:00
Eric W. Biederman	3e7cd6c413	[PATCH] pid: replace is_orphaned_pgrp with is_current_pgrp_orphaned Every call to is_orphaned_pgrp passed in process_group(current) which is racy with respect to another thread changing our process group. It didn't bite us because we were dealing with integers and the worse we would get would be a stale answer. In switching the checks to use struct pid to be a little more efficient and prepare the way for pid namespaces this race became apparent. So I simplified the calls to the more specialized is_current_pgrp_orphaned so I didn't have to worry about making logic changes to avoid the race. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:32 -08:00
Eric W. Biederman	0475ac0845	[PATCH] pid: use struct pid for talking about process groups in exitc Modify has_stopped_jobs and will_become_orphan_pgrp to use struct pid based process groups. This reduces the number of hash tables looks ups and paves the way for multiple pid spaces. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:32 -08:00
Eric W. Biederman	04a2e6a5cb	[PATCH] pid: make session_of_pgrp use struct pid instead of pid_t To properly implement a pid namespace I need to deal exclusively in terms of struct pid, because pid_t values become ambiguous. To this end session_of_pgrp is transformed to take and return a struct pid pointer. To avoid the need to worry about reference counting I now require my caller to hold the appropriate locks. Leaving callers repsonsible for increasing the reference count if they need access to the result outside of the locks. Since session_of_pgrp currently only has one caller and that caller simply uses only test the result for equality with another process group, the locking change means I don't actually have to acquire the tasklist_lock at all. tiocspgrp is also modified to take and release the lock. The logic there is a little more complicated but nothing I won't need when I convert pgrp of a tty to a struct pid pointer. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:31 -08:00
Eric W. Biederman	8d42db189c	[PATCH] signal: rewrite kill_something_info so it uses newer helpers The goal is to remove users of the old signal helper functions so they can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:31 -08:00
Ingo Molnar	944be0b224	[PATCH] close_files(): add scheduling point close_files() can sometimes take long enough to trigger the soft lockup detector. Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:30 -08:00
Alan Cox	3f05044715	[PATCH] kernel: shut up the IRQ mismatch messages The problem is various drivers legally validly and sensibly try to claim IRQs but the kernel insists on vomiting forth a giant irrelevant debugging spew when the types clash. Edit kernel/irq/manage.c go down to mismatch: in setup_irq() and ifdef out the if clause that checks for mismatches. It'll then just do the right thing and work sanely. For the current -mm kernel this will do the trick (and moves it into shared irq debugging as in debug mode the info spew is useful). I've had a variant of this in my private tree for some time as I got fed up on the mess on boxes where old legacy IRQs get reused. Signed-off-by: Alan Cox <alan@redhat.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:28 -08:00
David Woodhouse	a304e1b828	[PATCH] Debug shared irqs Drivers registering IRQ handlers with SA_SHIRQ really ought to be able to handle an interrupt happening before request_irq() returns. They also ought to be able to handle an interrupt happening during the start of their call to free_irq(). Let's test that hypothesis.... [bunk@stusta.de: Kconfig fixes] Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:28 -08:00
Al Viro	5ea8176994	[PATCH] sort the devres mess out * Split the implementation-agnostic stuff in separate files. * Make sure that targets using non-default request_irq() pull kernel/irq/devres.o * Introduce new symbols (HAS_IOPORT and HAS_IOMEM) defaulting to positive; allow architectures to turn them off (we needed these symbols anyway for dependencies of quite a few drivers). * protect the ioport-related parts of lib/devres.o with CONFIG_HAS_IOPORT. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 11:18:07 -08:00
Alexey Dobriyan	4b98d11b40	[PATCH] ifdef ->rchar, ->wchar, ->syscr, ->syscw from task_struct They are fat: 4x8 bytes in task_struct. They are uncoditionally updated in every fork, read, write and sendfile. They are used only if you have some "extended acct fields feature". And please, please, please, read(2) knows about bytes, not characters, why it is called "rchar"? Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 11:18:07 -08:00
Oleg Nesterov	8d06087714	[PATCH] _proc_do_string(): fix short reads If you try to read things like /proc/sys/kernel/osrelease with single-byte reads, you get just one byte and then EOF. This is because _proc_do_string() assumes that the caller is read()ing into a buffer which is large enough to fit the whole string in a single hit. Fix. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 11:18:07 -08:00
Robert P. J. Day	501b9ebf43	[PATCH] Fix apparent typo CONFIG_LOCKDEP_DEBUG Replace the apparent typo CONFIG_LOCKDEP_DEBUG with the correct CONFIG_DEBUG_LOCKDEP. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 11:18:06 -08:00
Mathieu Desnoyers	1efc5da3cf	[PATCH] order of lockdep off/on in vprintk() should be changed The order of locking between lockdep_off/on() and local_irq_save/restore() in vprintk() should be changed. * In kernel/printk.c : vprintk() does : preempt_disable() local_irq_save() lockdep_off() spin_lock(&logbuf_lock) spin_unlock(&logbuf_lock) if(!down_trylock(&console_sem)) up(&console_sem) lockdep_on() local_irq_restore() preempt_enable() The goals here is to make sure we do not call printk() recursively from kernel/lockdep.c:__lock_acquire() (called from spin_* and down/up) nor from kernel/lockdep.c:trace_hardirqs_on/off() (called from local_irq_restore/save). It can then potentially call printk() through mark_held_locks/mark_lock. It correctly protects against the spin_lock call and the up/down call, but it does not protect against local_irq_restore. It could cause infinite recursive printk/trace_hardirqs_on() calls when printk() is called from the mark_lock() error handing path. We should change the locking so it becomes correct : preempt_disable() lockdep_off() local_irq_save() spin_lock(&logbuf_lock) spin_unlock(&logbuf_lock) if(!down_trylock(&console_sem)) up(&console_sem) local_irq_restore() lockdep_on() preempt_enable() Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 11:18:06 -08:00
Kirill Korotaev	e3e8a75d2a	[PATCH] Extract and use wake_up_klogd() Remove hack with printing space to wake up klogd. Use explicit wake_up_klogd(). See earlier discussion http://groups.google.com/group/fa.linux.kernel/browse_frm/thread/75f496668409f58d/1a8f28983a51e1ff?lnk=st&q=wake_up_klogd+group%3Afa.linux.kernel&rnum=2#1a8f28983a51e1ff Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:34 -08:00
Ingo Molnar	11f57cedcf	[PATCH] audit: fix audit_filter_user_rules() initialization bug gcc emits this warning: kernel/auditfilter.c: In function 'audit_filter_user': kernel/auditfilter.c:1611: warning: 'state' is used uninitialized in this function I tend to agree with gcc - there are a couple of plausible exit paths from audit_filter_user_rules() where it does not set 'state', keeping the variable uninitialized. For example if a filter rule has an AUDIT_POSSIBLE action. Initialize to 'wont audit'. Fix whitespace damage too. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:34 -08:00
Kyle McMartin	d4d23add3a	[PATCH] Common compat_sys_sysinfo I noticed that almost all architectures implemented exactly the same sys32_sysinfo... except parisc, where a bug was to be found in handling of the uptime. So let's remove a whole whack of code for fun and profit. Cribbed compat_sys_sysinfo from x86_64's implementation, since I figured it would be the best tested. This patch incorporates Arnd's suggestion of not using set_fs/get_fs, but instead extracting out the common code from sys_sysinfo. Cc: Christoph Hellwig <hch@infradead.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:32 -08:00
Robert P. J. Day	72fd4a35a8	[PATCH] Numerous fixes to kernel-doc info in source files. A variety of (mostly) innocuous fixes to the embedded kernel-doc content in source files, including: * make multi-line initial descriptions single line * denote some function names, constants and structs as such * change erroneous opening '/' to '/' in a few places reword some text for clarity Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Randy.Dunlap" <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:32 -08:00
Alexey Dobriyan	b653d081c1	[PATCH] proc: remove useless (and buggy) ->nlink settings Bug: pnx8550 code creates directory but resets ->nlink to 1. create_proc_entry() et al will correctly set ->nlink for you. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Corey Minyard <minyard@acm.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Greg KH <greg@kroah.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:32 -08:00
Andrew Morton	cb799b8988	[PATCH] sysctl warning fix kernel/sysctl.c:2816: warning: 'sysctl_ipc_data' defined but not used Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:31 -08:00
Helge Deller	3db5db4fcd	[PATCH] use cycle_t instead of u64 in struct time_interpolator The 32bit and 64bit PARISC Linux kernels suffers from the problem, that the gettimeofday() call sometimes returns non-monotonic times. The easiest way to fix this, is to drop the PARISC-specific implementation and switch over to the generic TIME_INTERPOLATION framework. But in order to make it even compile on 32bit PARISC, the patch below which touches the generic Linux code, is mandatory. More information and the full patch with the parisc-specific changes is included in this thread: http://lists.parisc-linux.org/pipermail/parisc-linux/2006-December/031003.html As far as I could see, this patch does not change anything for the existing architectures which use this framework (IA64 and SPARC64), since "cycles_t" is defined there as unsigned 64bit-integer anyway (which then makes this patch a no-change for them). Signed-off-by: Helge Deller <deller@gmx.de> Cc: <linux-arch@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:31 -08:00
Theodore Ts'o	34f5a39899	[PATCH] Add TAINT_USER and ability to set taint flags from userspace Allow taint flags to be set from userspace by writing to /proc/sys/kernel/tainted, and add a new taint flag, TAINT_USER, to be used when userspace has potentially done something dangerous that might compromise the kernel. This will allow support personnel to ask further questions about what may have caused the user taint flag to have been set. For example, they might examine the logs of the realtime JVM to see if the Java program has used the really silly, stupid, dangerous, and completely-non-portable direct access to physical memory feature which MUST be implemented according to the Real-Time Specification for Java (RTSJ). Sigh. What were those silly people at Sun thinking? [akpm@osdl.org: build fix] [bunk@stusta.de: cleanup] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:29 -08:00
Alexey Dobriyan	b035b6de24	[PATCH] Consolidate default sched_clock() Use attribute(weak). Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:28 -08:00
Mathieu Desnoyers	23c887522e	[PATCH] Relay: add CPU hotplug support Mathieu originally needed to add this for tracing Xen, but it's something that's needed for any application that can be tracing while cpus are added. unplug isn't supported by this patch. The thought was that at minumum a new buffer needs to be added when a cpu comes up, but it wasn't worth the effort to remove buffers on cpu down since they'd be freed soon anyway when the channel was closed. [zanussi@us.ibm.com: avoid lock_cpu_hotplug deadlock] Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:28 -08:00
Robert P. J. Day	c376222960	[PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc(). Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the corresponding "kmem_cache_zalloc()" call. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Roland McGrath <roland@redhat.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Acked-by: Joel Becker <Joel.Becker@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Kara <jack@ucw.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:27 -08:00
Jason Baron	068135e635	[PATCH] lockdep: add graph depth information to /proc/lockdep Generate locking graph information into /proc/lockdep, for lock hierarchy documentation and visualization purposes. sample output: c089fd5c OPS: 138 FD: 14 BD: 1 --..: &tty->termios_mutex -> [c07a3430] tty_ldisc_lock -> [c07a37f0] &port_lock_key -> [c07afdc0] &rq->rq_lock_key#2 The lock classes listed are all the first-hop lock dependencies that lockdep has seen so far. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:26 -08:00
Jarek Poplawski	381a229209	[PATCH] lockdep: more unlock-on-error fixes - returns after DEBUG_LOCKS_WARN_ON added in 3 places - debug_locks checking after lookup_chain_cache() added in __lock_acquire() - locking for testing and changing global variable max_lockdep_depth added in __lock_acquire() From: Ingo Molnar <mingo@elte.hu> My __acquire_lock() cleanup introduced a locking bug: on SMP systems we'd release a non-owned graph lock. Fix this by moving the graph unlock back, and by leaving the max_lockdep_depth variable update possibly racy. (we dont care, it's just statistics) Also add some minimal debugging code to graph_unlock()/graph_lock(), which caught this locking bug. Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:26 -08:00
Oleg Nesterov	0c12b51712	[PATCH] kill_pid_info: kill acquired_tasklist_lock Kill acquired_tasklist_lock, sig_needs_tasklist() is very cheap nowadays. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:26 -08:00
Alexey Dobriyan	3ee75ac3c0	[PATCH] sysctl_{,ms_}jiffies: fix oldlen semantics currently it's 1) if oldlenp == 0, don't writeback anything 2) if oldlenp >= table->maxlen, don't writeback more than table->maxlen bytes and rewrite oldlenp don't look at underlying type granularity 3) if 0 < oldlenp < table->maxlen, cough string sysctls don't writeback more than oldlenp bytes. OK, that's because sizeof(char) == 1 int sysctls writeback anything in (0, table->maxlen] range Though accept integers divisible by sizeof(int) for writing. sysctl_jiffies and sysctl_ms_jiffies don't writeback anything but sizeof(int), which violates 1) and 2). So, make sysctl_jiffies and sysctl_ms_jiffies accept a) oldlenp == 0, not doing writeback b) oldlenp >= sizeof(int), writing one integer. -EINVAL still returned for oldlenp == 1, 2, 3. Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:24 -08:00
Mathieu Desnoyers	dc29a3657b	[PATCH] kernel/time/clocksource.c needs struct task_struct on m68k kernel/time/clocksource.c needs struct task_struct on m68k. Because it uses spin_unlock_irq(), which, on m68k, uses hardirq_count(), which uses preempt_count(), which needs to dereference struct task_struct, we have to include sched.h. Because it would cause a loop inclusion, we cannot include sched.h in any other of asm-m68k/system.h, linux/thread_info.h, linux/hardirq.h, which leaves this ugly include in a C file as the only simple solution. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Ingo Molnar <mingo@elte.hu> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:20 -08:00
Rafael J. Wysocki	2b5b09b3b5	[PATCH] swsusp: Change pm_ops handling by userland interface Make the userland interface of swsusp call pm_ops->finish() after enable_nonboot_cpus() and before resume_device(), as indicated by the recent discussion on Linux-PM (cf. http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html). This patch changes the SNAPSHOT_PMOPS ioctl so that its first function, PMOPS_PREPARE, only sets a switch turning the platform suspend mode on, and its last function, PMOPS_FINISH, only checks if the platform mode is enabled. This should allow the older userland tools to work with new kernels without any modifications. The changes here only affect the userland interface of swsusp. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Greg KH <greg@kroah.com> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Patrick Mochel <mochel@digitalimplant.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:20 -08:00
Andrew Morton	d12c610e08	[PATCH] swsusp-change-code-ordering-in-userc-sanity The compiler will do that. And if it doesn't, we don't want to either ;) Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Greg KH <greg@kroah.com> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Patrick Mochel <mochel@digitalimplant.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:19 -08:00
Rafael J. Wysocki	259130526c	[PATCH] swsusp: Change code ordering in user.c Change the ordering of code in kernel/power/user.c so that device_suspend() is called before disable_nonboot_cpus() and device_resume() is called after enable_nonboot_cpus(). This is needed to make the userland suspend call pm_ops->finish() after enable_nonboot_cpus() and before device_resume(), as indicated by the recent discussion on Linux-PM (cf. http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html). The changes here only affect the userland interface of swsusp. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Greg KH <greg@kroah.com> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Patrick Mochel <mochel@digitalimplant.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:19 -08:00
Rafael J. Wysocki	ed746e3b18	[PATCH] swsusp: Change code ordering in disk.c Change the ordering of code in kernel/power/disk.c so that device_suspend() is called before disable_nonboot_cpus() and platform_finish() is called after enable_nonboot_cpus() and before device_resume(), as indicated by the recent discussion on Linux-PM (cf. http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html). The changes here only affect the built-in swsusp. [alexey.y.starikovskiy@linux.intel.com: fix LED blinking during image load] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Greg KH <greg@kroah.com> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Patrick Mochel <mochel@digitalimplant.org> Cc: Alexey Starikovskiy <alexey.y.starikovskiy@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:19 -08:00
Rafael J. Wysocki	e3c7db621b	[PATCH] PM: Change code ordering in main.c As indicated in a recent thread on Linux-PM, it's necessary to call pm_ops->finish() before devce_resume(), but enable_nonboot_cpus() has to be called before pm_ops->finish() (cf. http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html). For consistency, it seems reasonable to call disable_nonboot_cpus() after device_suspend(). This way the suspend code will remain symmetrical with respect to the resume code and it may allow us to speed up things in the future by suspending and resuming devices and/or saving the suspend image in many threads. The following series of patches reorders the suspend and resume code so that nonboot CPUs are disabled after devices have been suspended and enabled before the devices are resumed. It also causes pm_ops->finish() to be called after enable_nonboot_cpus() wherever necessary. This patch: Change the ordering of code in kernel/power/main.c so that device_suspend() is called before disable_nonboot_cpus() and pm_ops->finish() is called after enable_nonboot_cpus() and before device_resume(), as indicated by recent discussion on Linux-PM (cf. http://lists.osdl.org/pipermail/linux-pm/2006-November/004164.html). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Greg KH <greg@kroah.com> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Patrick Mochel <mochel@digitalimplant.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:19 -08:00
Eric Paris	6ff1b4426e	[PATCH] make reading /proc/sys/kernel/cap-bould not require CAP_SYS_MODULE Reading /proc/sys/kernel/cap-bound requires CAP_SYS_MODULE. (see proc_dointvec_bset in kernel/sysctl.c) sysctl appears to drive all over proc reading everything it can get it's hands on and is complaining when it is being denied access to read cap-bound. Clearly writing to cap-bound should be a sensitive operation but requiring CAP_SYS_MODULE to read cap-bound seems a bit to strong. I believe the information could with reasonable certainty be obtained by looking at a bunch of the output of /proc/pid/status which has very low security protection, so at best we are just getting a little obfuscation of information. Currently SELinux policy has to 'dontaudit' capability checks for CAP_SYS_MODULE for things like sysctl which just want to read cap-bound. In doing so we also as a byproduct have to hide warnings of potential exploits such as if at some time that sysctl actually tried to load a module. I wondered if anyone would have a problem opening cap-bound up to read from anyone? Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:19 -08:00
Christoph Lameter	9617729941	[PATCH] Drop free_pages() nr_free_pages is now a simple access to a global variable. Make it a macro instead of a function. The nr_free_pages now requires vmstat.h to be included. There is one occurrence in power management where we need to add the include. Directly refrer to global_page_state() there to clarify why the #include was added. [akpm@osdl.org: arm build fix] [akpm@osdl.org: sparc64 build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:18 -08:00
Christoph Lameter	d23ad42324	[PATCH] Use ZVC for free_pages This is again simplifies some of the VM counter calculations through the use of the ZVC consolidated counters. [michal.k.k.piotrowski@gmail.com: build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:17 -08:00
Tejun Heo	9ac7849e35	devres: device resource management Implement device resource management, in short, devres. A device driver can allocate arbirary size of devres data which is associated with a release function. On driver detach, release function is invoked on the devres data, then, devres data is freed. devreses are typed by associated release functions. Some devreses are better represented by single instance of the type while others need multiple instances sharing the same release function. Both usages are supported. devreses can be grouped using devres group such that a device driver can easily release acquired resources halfway through initialization or selectively release resources (e.g. resources for port 1 out of 4 ports). This patch adds devres core including documentation and the following managed interfaces. * alloc/free : devm_kzalloc(), devm_kzfree() * IO region : devm_request_region(), devm_release_region() * IRQ : devm_request_irq(), devm_free_irq() * DMA : dmam_alloc_coherent(), dmam_free_coherent(), dmam_declare_coherent_memory(), dmam_pool_create(), dmam_pool_destroy() * PCI : pcim_enable_device(), pcim_pin_device(), pci_is_managed() * iomap : devm_ioport_map(), devm_ioport_unmap(), devm_ioremap(), devm_ioremap_nocache(), devm_iounmap(), pcim_iomap_table(), pcim_iomap(), pcim_iounmap() Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-02-09 17:39:36 -05:00
Ralf Baechle	7726942fb1	[APM] Add shared version of APM emulation Currently ARM and MIPS both have nearly identical copies of the APM emulation code in their arch code. Add yet another copy of it to drivers char and make it selectable through SYS_SUPPORTS_APM_EMULATION. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2007-02-09 17:08:57 +00:00
Linus Torvalds	78149df6d5	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (41 commits) Revert "PCI: remove duplicate device id from ata_piix" msi: Make MSI useable more architectures msi: Kill the msi_desc array. msi: Remove attach_msi_entry. msi: Fix msi_remove_pci_irq_vectors. msi: Remove msi_lock. msi: Kill msi_lookup_irq MSI: Combine pci_(save\|restore)_msi/msix_state MSI: Remove pci_scan_msi_device() MSI: Replace pci_msi_quirk with calls to pci_no_msi() PCI: remove duplicate device id from ipr PCI: remove duplicate device id from ata_piix PCI: power management: remove noise on non-manageable hw PCI: cleanup MSI code PCI: make isa_bridge Alpha-only PCI: remove quirk_sis_96x_compatible() PCI: Speed up the Intel SMBus unhiding quirk PCI Quirk: 1k I/O space IOBL_ADR fix on P64H2 shpchp: delete trailing whitespace shpchp: remove DBG_XXX_ROUTINE ...	2007-02-07 19:23:44 -08:00
Eric W. Biederman	5b912c108c	msi: Kill the msi_desc array. We need to be able to get from an irq number to a struct msi_desc. The msi_desc array in msi.c had several short comings the big one was that it could not be used outside of msi.c. Using irq_data in struct irq_desc almost worked except on some architectures irq_data needs to be used for something else. So this patch adds a msi_desc pointer to irq_desc, adds the appropriate wrappers and changes all of the msi code to use them. The dynamic_irq_init/cleanup code was tweaked to ensure the new field is left in a well defined state. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 15:50:08 -08:00
Kay Sievers	270a6c4cad	/sys/modules/*/holders /sys/module/usbcore/ \|-- drivers \| \|-- usb:hub -> ../../../subsystem/usb/drivers/hub \| \|-- usb:usb -> ../../../subsystem/usb/drivers/usb \| `-- usb:usbfs -> ../../../subsystem/usb/drivers/usbfs \|-- holders \| \|-- ehci_hcd -> ../../../module/ehci_hcd \| \|-- uhci_hcd -> ../../../module/uhci_hcd \| \|-- usb_storage -> ../../../module/usb_storage \| `-- usbhid -> ../../../module/usbhid \|-- initstate Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:12 -08:00
Greg Kroah-Hartman	fe480a2675	Modules: only add drivers/ direcory if needed This changes the module core to only create the drivers/ directory if we are going to put something in it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:12 -08:00
Kay Sievers	f30c53a873	MODULES: add the module name for built in kernel drivers Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-02-07 10:37:12 -08:00
Al Viro	9abcf40b1d	[PATCH] fork_idle() should be __cpuinit, not __devinit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-01 16:17:06 -08:00
Masami Hiramatsu	ab40c5c6b6	[PATCH] kprobes: replace magic numbers with enum Replace the magic numbers with an enum, and gets rid of a warning on the specific architectures (ex. powerpc) on which the compiler considers 'char' as 'unsigned char'. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 16:01:35 -08:00
Serge E. Hallyn	0f2452855d	[PATCH] namespaces: fix task exit disaster This is based on a patch by Eric W. Biederman, who pointed out that pid namespaces are still fake, and we only have one ever active. So for the time being, we can modify any code which could access tsk->nsproxy->pid_ns during task exit to just use &init_pid_ns instead, and move the exit_task_namespaces call in do_exit() back above exit_notify(), so that an exiting nfs server has a valid tsk->sighand to work with. Long term, pulling pid_ns out of nsproxy might be the cleanest solution. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> [ Eric's patch fixed to take care of free_pid() too ] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 13:40:36 -08:00
Linus Torvalds	444f378b23	Revert "[PATCH] namespaces: fix exit race by splitting exit" This reverts commit `7a238fcba0` in preparation for a better and simpler fix proposed by Eric Biederman (and fixed up by Serge Hallyn) Acked-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 13:35:18 -08:00
Serge E. Hallyn	7a238fcba0	[PATCH] namespaces: fix exit race by splitting exit Fix exit race by splitting the nsproxy putting into two pieces. First piece reduces the nsproxy refcount. If we dropped the last reference, then it puts the mnt_ns, and returns the nsproxy as a hint to the caller. Else it returns NULL. The second piece of exiting task namespaces sets tsk->nsproxy to NULL, and drops the references to other namespaces and frees the nsproxy only if an nsproxy was passed in. A little awkward and should probably be reworked, but hopefully it fixes the NFS oops. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Daniel Hokka Zakrisson <daniel@hozac.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-30 08:26:44 -08:00
Linus Torvalds	8528b0f1de	Clear spurious irq stat information when adding irq handler Any newly added irq handler may obviously make any old spurious irq status invalid, since the new handler may well be the thing that is supposed to handle any interrupts that came in. So just clear the statistics when adding handlers. Pointed-out-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-23 14:16:31 -08:00
Ingo Molnar	1b5180b651	[PATCH] notifiers: fix blocking_notifier_call_chain() scalability while lock-profiling the -rt kernel i noticed weird contention during mmap-intense workloads, and the tracer showed the following gem, in one of our MM hotpaths: threaded-2771 1.... 65us : sys_munmap (sysenter_do_call) threaded-2771 1.... 66us : profile_munmap (sys_munmap) threaded-2771 1.... 66us : blocking_notifier_call_chain (profile_munmap) threaded-2771 1.... 66us : rt_down_read (blocking_notifier_call_chain) ouch! a global rw-semaphore taken in one of the most performance- sensitive codepaths of the kernel. And i dont even have oprofile enabled! All distro kernels have CONFIG_PROFILING enabled, so this scalability problem affects the majority of Linux users. The fix is to enhance blocking_notifier_call_chain() to only take the lock if there appears to be work on the call-chain. With this patch applied i get nicely saturated system, and much higher munmap performance, on SMP systems. And as a bonus this also fixes a similar scalability bottleneck in the thread-exit codepath: profile_task_exit() ... Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-23 11:08:03 -08:00
Andrew Morton	bbe1a59b3a	[PATCH] fix "kvm: add vm exit profiling" export profile_hits() on !SMP too. Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-01-23 07:52:05 -08:00
Ingo Molnar	07031e14c1	[PATCH] KVM: add VM-exit profiling This adds the profile=kvm boot option, which enables KVM to profile VM exits. Use: "readprofile -m ./System.map \| sort -n" to see the resulting output: [...] 18246 serial_out 148.3415 18945 native_flush_tlb 378.9000 23618 serial_in 212.7748 29279 __spin_unlock_irq 622.9574 43447 native_apic_write 2068.9048 52702 enable_8259A_irq 742.2817 54250 vgacon_scroll 89.3740 67394 ide_inb 6126.7273 79514 copy_page_range 98.1654 84868 do_wp_page 86.6000 140266 pit_read 783.6089 151436 ide_outb 25239.3333 152668 native_io_delay 21809.7143 174783 mask_and_ack_8259A 783.7803 362404 native_set_pte_at 36240.4000 1688747 total 0.5009 Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-11 18:18:21 -08:00
Gautham R Shenoy	b282b6f8a8	[PATCH] Change cpu_up and co from __devinit to __cpuinit Compiling the kernel with CONFIG_HOTPLUG = y and CONFIG_HOTPLUG_CPU = n with CONFIG_RELOCATABLE = y generates the following modpost warnings WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141b7d) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141b9c) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.text:__cpu_up from .text between '_cpu_up' (at offset 0xc0141bd8) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141c05) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141c26) and 'cpu_up' WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between '_cpu_up' (at offset 0xc0141c37) and 'cpu_up' This is because cpu_up, _cpu_up and __cpu_up (in some architectures) are defined as __devinit AND __cpu_up calls some __cpuinit functions. Since __cpuinit would map to __init with this kind of a configuration, we get a .text refering .init.data warning. This patch solves the problem by converting all of __cpu_up, _cpu_up and cpu_up from __devinit to __cpuinit. The approach is justified since the callers of cpu_up are either dependent on CONFIG_HOTPLUG_CPU or are of __init type. Thus when CONFIG_HOTPLUG_CPU=y, all these cpu up functions would land up in .text section, and when CONFIG_HOTPLUG_CPU=n, all these functions would land up in .init section. Tested on a i386 SMP machine running linux-2.6.20-rc3-mm1. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-11 18:18:20 -08:00
Nathan Lynch	e5e5673f82	[PATCH] sched: tasks cannot run on cpus onlined after boot Commit `5c1e176781` ("sched: force /sbin/init off isolated cpus") sets init's cpus_allowed to a subset of cpu_online_map at boot time, which means that tasks won't be scheduled on cpus that are added to the system later. Make init's cpus_allowed a subset of cpu_possible_map instead. This should still preserve the behavior that Nick's change intended. Thanks to Giuliano Pochini for reporting this and testing the fix: http://ozlabs.org/pipermail/linuxppc-dev/2006-December/029397.html Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-11 18:18:20 -08:00
Vivek Goyal	343cde51b3	[PATCH] x86-64: Make noirqdebug_setup function non init to fix modpost warning o noirqdebug_setup() is __init but it is being called by quirk_intel_irqbalance() which if of type __devinit. If CONFIG_HOTPLUG=y, quirk_intel_irqbalance() is put into text section and it is wrong to call a function in __init section. o MODPOST flags this on i386 if CONFIG_RELOCATABLE=y WARNING: vmlinux - Section mismatch: reference to .init.text:noirqdebug_setup from .text between 'quirk_intel_irqbalance' (at offset 0xc010969e) and 'i8237A_suspend' o Make noirqdebug_setup() non-init. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-01-11 01:52:44 +01:00
Linus Torvalds	d0abc451a6	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: [PATCH] Driver core: Fix prefix driver links in /sys/module by bus-name	2007-01-06 00:10:55 -08:00
Ingo Molnar	a75acf850c	[PATCH] profiling: fix sched profiling typo Fix sched profiling typo, introduced by the sleep profiling patch. This bug caused profile=sched to not work. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:22 -08:00
Rafael J. Wysocki	7bf2368742	[PATCH] swsusp: Do not fail if resume device is not set In the kernels later than 2.6.19 there is a regression that makes swsusp fail if the resume device is not explicitly specified. It can be fixed by adding an additional parameter to mm/swapfile.c:swap_type_of() allowing us to pass the (struct block_device *) corresponding to the first available swap back to the caller. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:22 -08:00
Ard van Breemen	a416aba637	[PATCH] kernelparams: detect if and which parameter parsing enabled irq's The parsing of some kernel parameters seem to enable irq's at a stage that irq's are not supposed to be enabled (Particularly the ide kernel parameters). Having irq's enabled before the irq controller is initialized might lead to a kernel panic. This patch only detects this behaviour and warns about wich parameter caused it. [akpm@osdl.org: cleanups] Signed-off-by: Ard van Breemen <ard@telegraafnet.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2007-01-05 23:55:21 -08:00
Kay Sievers	7f422e2e84	[PATCH] Driver core: Fix prefix driver links in /sys/module by bus-name Modules may have drivers with the same name on different buses. This patch fixes this problem. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-01-05 12:33:38 -08:00
Oleg Nesterov	241ceee0b4	[PATCH] restore ->pdeath_signal behaviour Commit `b2b2cbc4b2` introduced a user- visible change: ->pdeath_signal is sent only when the entire thread group exits. While this change is imho good, it may break things. So restore the old behaviour for now. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> To: Albert Cahalan <acahalan@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Linus Torvalds <torvalds@osdl.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Qi Yong <qiyong@fc-cn.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-31 14:41:18 -08:00
Andrew Morton	755cd90029	[PATCH] lockdep: printk warning fix kernel/lockdep.c: In function `lookup_chain_cache': kernel/lockdep.c:1339: warning: long long unsigned int format, u64 arg (arg 2) kernel/lockdep.c:1344: warning: long long unsigned int format, u64 arg (arg 2) Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-30 10:56:43 -08:00
Andrew Morton	089e34b600	[PATCH] cpuset procfs warning fix fs/proc/base.c:1869: warning: initialization discards qualifiers from pointer target type fs/proc/base.c:2150: warning: initialization discards qualifiers from pointer target type Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-30 10:56:43 -08:00
Akinobu Mita	a3f99f8ba8	[PATCH] module: fix mod_sysfs_setup() return value mod_sysfs_setup() doesn't return error when kobject_add_dir() failed. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-30 10:56:41 -08:00
Ingo Molnar	9414232fa0	[PATCH] sched: fix cond_resched_softirq() offset Remove the __resched_legal() check: it is conceptually broken. The biggest problem it had is that it can mask buggy cond_resched() calls. A cond_resched() call is only legal if we are not in an atomic context, with two narrow exceptions: - if the system is booting - a reacquire_kernel_lock() down() done while PREEMPT_ACTIVE is set But __resched_legal() hid this and just silently returned whenever these primitives were called from invalid contexts. (Same goes for cond_resched_locked() and cond_resched_softirq()). Furthermore, the __legal_resched(0) call was buggy in that it caused unnecessarily long softirq latencies via cond_resched_softirq(). (which is only called from softirq-off sections, hence the code did nothing.) The fix is to resurrect the efficiency of the might_sleep checks and to only allow the narrow exceptions. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-30 10:56:41 -08:00
Ingo Molnar	e4e6bdbb42	[PATCH] rcu: rcutorture suspend fix Fix suspend hang: rcutorture threads need to be nofreeze. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-30 10:55:55 -08:00
Ingo Molnar	e1d9fd2e3d	[PATCH] suspend: fix suspend on single-CPU systems Clark Williams reported that suspend doesnt work on his laptop on 2.6.20-rc1-rt kernels. The bug was introduced by the following cleanup commit: commit `112cecb2cc` Author: Siddha, Suresh B <suresh.b.siddha@intel.com> Date: Wed Dec 6 20:34:31 2006 -0800 [PATCH] suspend: don't change cpus_allowed for task initiating the suspend because with this change 'error' is not initialized to 0 anymore, if there are no other online CPUs. (i.e. if the system is single-CPU). the fix is the initialize it to 0. The really weird thing is that my version of gcc does not warn about this non-initialized variable situation ... (also fix the kernel printk in the error branch, it was missing a newline) Reported-by: Clark Williams <williams@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-23 13:59:33 -08:00
Linus Torvalds	18ed1c0513	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits) ACPI: replace kmalloc+memset with kzalloc ACPI: Add support for acpi_load_table/acpi_unload_table_id fbdev: update after backlight argument change ACPI: video: Add dev argument for backlight_device_register ACPI: Implement acpi_video_get_next_level() ACPI: Kconfig - depend on PM rather than selecting it ACPI: fix NULL check in drivers/acpi/osl.c ACPI: make drivers/acpi/ec.c:ec_ecdt static ACPI: prevent processor module from loading on failures ACPI: fix single linked list manipulation ACPI: ibm_acpi: allow clean removal ACPI: fix git automerge failure ACPI: ibm_acpi: respond to workqueue update ACPI: dock: add uevent to indicate change in device status ACPI: ec: Lindent once again ACPI: ec: Change #define to enums there possible. ACPI: ec: Style changes. ACPI: ec: Acquire Global Lock under EC mutex. ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead. ACPI: ec: Rename gpe_bit to gpe ...	2006-12-22 18:46:56 -08:00
Eric W. Biederman	b2b2cbc4b2	[PATCH] Fix reparenting to the same thread group. (take 2) This patch fixes the case when we reparent to a different thread in the same thread group. This modifies the code so that we do not send signals and do not change the signal to send to SIGCHLD unless we have change the thread group of our parents. It also suppresses sending pdeath_sig in this cas as well since the result of geppid doesn't change. Thanks to Oleg for spotting my bug of only fixing this for non-ptraced tasks. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Albert Cahalan <acahalan@gmail.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Coywolf Qi Hunt <qiyong@fc-cn.com> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 09:03:41 -08:00
Andrew Morton	192636ad90	[PATCH] relay: remove inlining text data bss dec hex filename before: 4036 44 0 4080 ff0 kernel/relay.o after: 3727 44 0 3771 ebb kernel/relay.o Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:51 -08:00
Vadim Lobanov	01b2d93ca4	[PATCH] fdtable: Provide free_fdtable() wrapper Christoph Hellwig has expressed concerns that the recent fdtable changes expose the details of the RCU methodology used to release no-longer-used fdtable structures to the rest of the kernel. The trivial patch below addresses these concerns by introducing the appropriate free_fdtable() calls, which simply wrap the release RCU usage. Since free_fdtable() is a one-liner, it makes sense to promote it to an inline helper. Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:50 -08:00
Andrew Morton	5b149bcc23	[PATCH] schedule_timeout(): improve warning message Kyle is hitting this warning, and we don't have a clue what it's caused by. Add the obligatory dump_stack(). Cc: kyle <kylewong@southa.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:49 -08:00
Akinobu Mita	3e1fbd12c9	[PATCH] audit: fix kstrdup() error check kstrdup() returns NULL on error. Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:49 -08:00
Thomas Gleixner	9d7ac8be4b	[PATCH] genirq: fix irq flow handler uninstall The sanity check for no_irq_chip in __set_irq_hander() is unconditional on both install and uninstall of an handler. This triggers false warnings and replaces no_irq_chip by dummy_irq_chip in the uninstall case. Check only, when a real handler is installed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Sylvain Munaut <tnt@246tNt.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:48 -08:00
Tim Chen	67af63a6ab	[PATCH] sched: remove __cpuinitdata anotation to cpu_isolated_map The structure cpu_isolated_map is used not only during initialization. Multi-core scheduler configuration changes and exclusive cpusets use this during run time. During setting of sched_mc_power_savings policy, this structure is accessed to update sched_domains. Signed-off-by: Tim Chen <tim.c.chen@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:47 -08:00
Adrian Bunk	99eea6a105	[PATCH] make kernel/printk.c:ignore_loglevel_setup() static Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:47 -08:00
Randy Dunlap	af9997e426	[PATCH] fix kernel-doc warnings in 2.6.20-rc1 Fix kernel-doc warnings in 2.6.20-rc1. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:47 -08:00
Mark Fasheh	ba0084048a	[PATCH] Conditionally check expected_preempt_count in __resched_legal() Commit `2d7d253548` ("fix cond_resched() fix") introduced an 'expected_preempt_count' parameter to __resched_legal() to fix a bug where it was returning a false negative when called from cond_resched_lock() and preemption was enabled. Unfortunately this broke things for when preemption is disabled. preempt_count() will always return zero, thus failing the check against any value of expected_preempt_count not equal to zero. cond_resched_lock() for example, passes an expected_preempt_count value of 1. So fix the fix for the cond_resched() fix by skipping the check of preempt_count() against expected_preempt_count when preemption is disabled. Credit should go to Sunil Mushran for spotting the bug during testing. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:46 -08:00
Ingo Molnar	9bfb18392e	[PATCH] workqueue: fix schedule_on_each_cpu() fix the schedule_on_each_cpu() implementation: __queue_work() is now stricter, hence set the work-pending bit before passing in the new work. (found in the -rt tree, using Peter Zijlstra's files-lock scalability patchset) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-21 00:20:01 -08:00
Peter Williams	bc947631d1	[PATCH] sched: improve efficiency of sched_fork() Problem: sched_fork() has always called scheduler_tick() in some (unlikely) circumstances in order to update the current task in light of those circumstances. It has always been the case that the work done by scheduler_tick() was more than was required to handle the problem in hand but no harm was done except for the waste of a few CPU cycles. However, the splitting of scheduler_tick() into two procedures in 2.6.20-rc1 enables the wasted cycles to be saved as the new procedure task_running_tick() does all the work that is required to rectify the problem being handled. Solution: Replace the call to scheduler_tick() in sched_fork() with a call to task_running_tick(). Signed-off-by: Peter Williams <pwil3058@bigpond.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-21 00:11:51 -08:00
Geert Uytterhoeven	b039db8eea	[PATCH] __set_irq_handler bogus space __set_irq_handler: Kill a bogus space Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-21 00:08:27 -08:00
Len Brown	9774f33841	merge linus into test branch	2006-12-20 02:53:13 -05:00
Linus Torvalds	a08727bae7	Make workqueue bit operations work on "atomic_long_t" On architectures where the atomicity of the bit operations is handled by external means (ie a separate spinlock to protect concurrent accesses), just doing a direct assignment on the workqueue data field (as done by commit `4594bf159f`) can cause the assignment to be lost due to lack of serialization with the bitops on the same word. So we need to serialize the assignment with the locks on those architectures (notably older ARM chips, PA-RISC and sparc32). So rather than using an "unsigned long", let's use "atomic_long_t", which already has a safe assignment operation (atomic_long_set()) on such architectures. This requires that the atomic operations use the same atomicity locks as the bit operations do, but that is largely the case anyway. Sparc32 will probably need fixing. Architectures (including modern ARM with LL/SC) that implement sane atomic operations for SMP won't see any of this matter. Cc: Russell King <rmk+lkml@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: David Miller <davem@davemloft.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Linux Arch Maintainers <linux-arch@vger.kernel.org> Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-16 09:53:50 -08:00
Len Brown	cfee47f99b	Pull bugfix into test branch Conflicts: kernel/power/disk.c	2006-12-16 01:01:18 -05:00
Linus Torvalds	d1526e2cda	Remove stack unwinder for now It has caused more problems than it ever really solved, and is apparently not getting cleaned up and fixed. We can put it back when it's stable and isn't likely to make warning or bug events worse. In the meantime, enable frame pointers for more readable stack traces. Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-15 08:47:51 -08:00
David Brownell	f89bce3d9a	Driver core: deprecate PM_LEGACY, default it to N Deprecate the old "legacy" PM API, and more importantly default it to "n". Virtually nothing in-tree uses it any more. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-13 15:38:46 -08:00
Kay Sievers	1f71740ab9	Driver core: show "initstate" of module Show the initialization state(live, coming, going) of the module: $ cat /sys/module/usbcore/initstate live Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-12-13 15:38:45 -08:00
Ralf Baechle	ec8c0446b6	[PATCH] Optimize D-cache alias handling on fork Virtually index, physically tagged cache architectures can get away without cache flushing when forking. This patch adds a new cache flushing function flush_cache_dup_mm(struct mm_struct *) which for the moment I've implemented to do the same thing on all architectures except on MIPS where it's a no-op. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:27:08 -08:00
Eric Dumazet	cd7175edf9	[PATCH] Optimize calc_load() calc_load() is called by timer interrupt to update avenrun[]. It currently calls nr_active() at each timer tick (HZ per second), while the update of avenrun[] is done only once every 5 seconds. (LOAD_FREQ=5 Hz) nr_active() is quite expensive on SMP machines, since it has to sum up nr_running and nr_uninterruptible of all online CPUS, bringing foreign dirty cache lines. This patch is an optimization of calc_load() so that nr_active() is called only if we need it. The use of unlikely() is welcome since the condition is true only once every 5*HZ time. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:55 -08:00
Robert P. J. Day	cd86128088	[PATCH] Fix numerous kcalloc() calls, convert to kzalloc() All kcalloc() calls of the form "kcalloc(1,...)" are converted to the equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect ordering of the first two arguments are fixed. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Adam Belay <ambx1@neo.rr.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:52 -08:00
Ingo Molnar	74c383f140	[PATCH] lockdep: fix possible races while disabling lock-debugging Jarek Poplawski noticed that lockdep global state could be accessed in a racy way if one CPU did a lockdep assert (shutting lockdep down), while the other CPU would try to do something that changes its global state. This patch fixes those races and cleans up lockdep's internal locking by adding a graph_lock()/graph_unlock()/debug_locks_off_graph_unlock helpers. (Also note that as we all know the Linux kernel is, by definition, bug-free and perfect, so this code never triggers, so these fixes are highly theoretical. I wrote this patch for aesthetic reasons alone.) [akpm@osdl.org: build fix] [jarkao2@o2.pl: build fix's refix] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jarek Poplawski <jarkao2@o2.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Ingo Molnar	3117df0453	[PATCH] lockdep: print irq-trace info on asserts When we print an assert due to scheduling-in-atomic bugs, and if lockdep is enabled, then the IRQ tracing information of lockdep can be printed to pinpoint the code location that disabled interrupts. This saved me quite a bit of debugging time in cases where the backtrace did not identify the irq-disabling site well enough. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Ingo Molnar	27c3b23226	[PATCH] lockdep: use chain hash on CONFIG_DEBUG_LOCKDEP too CONFIG_DEBUG_LOCKDEP is unacceptably slow because it does not utilize the chain-hash. Turn the chain-hash back on in this case too. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Ingo Molnar	33e94e960b	[PATCH] lockdep: clean up VERY_VERBOSE define Cleanup: the VERY_VERBOSE define was unnecessarily dependent on #ifdef VERBOSE - while the VERBOSE switch is 0 or 1 (always defined). Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Ingo Molnar	23d95a03d6	[PATCH] lockdep: improve lockdep_reset() Clear all the chains during lockdep_reset(). This fixes some locking-selftest false positives i saw on -rt. (never saw those on mainline though, but it could happen.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Ingo Molnar	81fc685a89	[PATCH] lockdep: improve verbose messages Make verbose lockdep messages (off by default) more informative by printing out the hash chain key. (this patch was what helped me catch the earlier lockdep hash-collision bug) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Ingo Molnar	a664089741	[PATCH] lockdep: filter off by default Fix typo in the class_filter() function. (filtering is not used by default so this only affects lockdep-internal debugging cases) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Ingo Molnar	5d6f647fc6	[PATCH] debug: add sysrq_always_enabled boot option Most distributions enable sysrq support but set it to 0 by default. Add a sysrq_always_enabled boot option to always-enable sysrq keys. Useful for debugging - without having to modify the disribution's config files (which might not be possible if the kernel is on a live CD, etc.). Also, while at it, clean up the sysrq interfaces. [bunk@stusta.de: make sysrq_always_enabled_setup() static] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:50 -08:00
Rafael J. Wysocki	8a102eed9c	[PATCH] PM: Fix SMP races in the freezer Currently, to tell a task that it should go to the refrigerator, we set the PF_FREEZE flag for it and send a fake signal to it. Unfortunately there are two SMP-related problems with this approach. First, a task running on another CPU may be updating its flags while the freezer attempts to set PF_FREEZE for it and this may leave the task's flags in an inconsistent state. Second, there is a potential race between freeze_process() and refrigerator() in which freeze_process() running on one CPU is reading a task's PF_FREEZE flag while refrigerator() running on another CPU has just set PF_FROZEN for the same task and attempts to reset PF_FREEZE for it. If the refrigerator wins the race, freeze_process() will state that PF_FREEZE hasn't been set for the task and will set it unnecessarily, so the task will go to the refrigerator once again after it's been thawed. To solve first of these problems we need to stop using PF_FREEZE to tell tasks that they should go to the refrigerator. Instead, we can introduce a special TIF_* flag and use it for this purpose, since it is allowed to change the other tasks' TIF_* flags and there are special calls for it. To avoid the freeze_process()-refrigerator() race we can make freeze_process() to always check the task's PF_FROZEN flag after it's read its "freeze" flag. We should also make sure that refrigerator() will always reset the task's "freeze" flag after it's set PF_FROZEN for it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@muc.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:49 -08:00
Rafael J. Wysocki	3df494a32b	[PATCH] PM: Fix freezing of stopped tasks Currently, if a task is stopped (ie. it's in the TASK_STOPPED state), it is considered by the freezer as unfreezeable. However, there may be a race between the freezer and the delivery of the continuation signal to the task resulting in the task running after we have finished freezing the other tasks. This, in turn, may lead to undesirable effects up to and including data corruption. To prevent this from happening we first need to make the freezer consider stopped tasks as freezeable. For this purpose we need to make freezeable() stop returning 0 for these tasks and we need to force them to enter the refrigerator. However, if there's no continuation signal in the meantime, the stopped tasks should remain stopped after all processes have been thawed, so we need to send an additional SIGSTOP to each of them before waking it up. Also, a stopped task that has just been woken up should first check if there's a freezing request for it and go to the refrigerator if that's the case. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:49 -08:00
Paul Jackson	02a0e53d82	[PATCH] cpuset: rework cpuset_zone_allowed api Elaborate the API for calling cpuset_zone_allowed(), so that users have to explicitly choose between the two variants: cpuset_zone_allowed_hardwall() cpuset_zone_allowed_softwall() Until now, whether or not you got the hardwall flavor depended solely on whether or not you or'd in the __GFP_HARDWALL gfp flag to the gfp_mask argument. If you didn't specify __GFP_HARDWALL, you implicitly got the softwall version. Unfortunately, this meant that users would end up with the softwall version without thinking about it. Since only the softwall version might sleep, this led to bugs with possible sleeping in interrupt context on more than one occassion. The hardwall version requires that the current tasks mems_allowed allows the node of the specified zone (or that you're in interrupt or that __GFP_THISNODE is set or that you're on a one cpuset system.) The softwall version, depending on the gfp_mask, might allow a node if it was allowed in the nearest enclusing cpuset marked mem_exclusive (which requires taking the cpuset lock 'callback_mutex' to evaluate.) This patch removes the cpuset_zone_allowed() call, and forces the caller to explicitly choose between the hardwall and the softwall case. If the caller wants the gfp_mask to determine this choice, they should (1) be sure they can sleep or that __GFP_HARDWALL is set, and (2) invoke the cpuset_zone_allowed_softwall() routine. This adds another 100 or 200 bytes to the kernel text space, due to the few lines of nearly duplicate code at the top of both cpuset_zone_allowed_* routines. It should save a few instructions executed for the calls that turned into calls of cpuset_zone_allowed_hardwall, thanks to not having to set (before the call) then check (within the call) the __GFP_HARDWALL flag. For the most critical call, from get_page_from_freelist(), the same instructions are executed as before -- the old cpuset_zone_allowed() routine it used to call is the same code as the cpuset_zone_allowed_softwall() routine that it calls now. Not a perfect win, but seems worth it, to reduce this chance of hitting a sleeping with irq off complaint again. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:49 -08:00
Eric W. Biederman	5f8442edfb	[PATCH] Revert "[PATCH] identifier to nsproxy" This reverts commit `373beb35cd`. No one is using this identifier yet. The purpose of this identifier is to export nsproxy to user space which is wrong. nsproxy is an internal implementation optimization, which should keep our fork times from getting slower as we increase the number of global namespaces you don't have to share. Adding a global identifier like this is inappropriate because it makes namespaces inherently non-recursive, greatly limiting what we can do with them in the future. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-13 09:05:47 -08:00
Daniel Walker	f5f1a24a2c	[PATCH] clocksource: small cleanup Mostly changing alignment. Just some general cleanup. [akpm@osdl.org: build fix] Signed-off-by: Daniel Walker <dwalker@mvista.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:22 -08:00
Daniel Walker	2b0137001d	[PATCH] clocksource: add usage of CONFIG_SYSFS Simply adds some ifdefs to remove clocksoure sysfs code when CONFIG_SYSFS isn't turn on. Signed-off-by: Daniel Walker <dwalker@mvista.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:22 -08:00
Arjan van de Ven	4c36a5dec2	[PATCH] round_jiffies infrastructure Introduce a round_jiffies() function as well as a round_jiffies_relative() function. These functions round a jiffies value to the next whole second. The primary purpose of this rounding is to cause all "we don't care exactly when" timers to happen at the same jiffy. This avoids multiple timers firing within the second for no real reason; with dynamic ticks these extra timers cause wakeups from deep sleep CPU sleep states and thus waste power. The exact wakeup moment is skewed by the cpu number, to avoid all cpus from waking up at the exact same time (and hitting the same lock/cachelines there) [akpm@osdl.org: fix variable type] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:22 -08:00
Vadim Lobanov	4fd45812cb	[PATCH] fdtable: Remove the free_files field An fdtable can either be embedded inside a files_struct or standalone (after being expanded). When an fdtable is being discarded after all RCU references to it have expired, we must either free it directly, in the standalone case, or free the files_struct it is contained within, in the embedded case. Currently the free_files field controls this behavior, but we can get rid of it entirely, as all the necessary information is already recorded. We can distinguish embedded and standalone fdtables using max_fds, and if it is embedded we can divine the relevant files_struct using container_of(). Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:22 -08:00
Vadim Lobanov	bbea9f6966	[PATCH] fdtable: Make fdarray and fdsets equal in size Currently, each fdtable supports three dynamically-sized arrays of data: the fdarray and two fdsets. The code allows the number of fds supported by the fdarray (fdtable->max_fds) to differ from the number of fds supported by each of the fdsets (fdtable->max_fdset). In practice, it is wasteful for these two sizes to differ: whenever we hit a limit on the smaller-capacity structure, we will reallocate the entire fdtable and all the dynamic arrays within it, so any delta in the memory used by the larger-capacity structure will never be touched at all. Rather than hogging this excess, we shouldn't even allocate it in the first place, and keep the capacities of the fdarray and the fdsets equal. This patch removes fdtable->max_fdset. As an added bonus, most of the supporting code becomes simpler. Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:22 -08:00
Vadim Lobanov	f3d19c90fb	[PATCH] fdtable: Delete pointless code in dup_fd() The dup_fd() function creates a new files_struct and fdtable embedded inside that files_struct, and then possibly expands the fdtable using expand_files(). The out_release error path is invoked when expand_files() returns an error code. However, when this attempt to expand fails, the fdtable is left in its original embedded form, so it is pointless to try to free the associated fdarray and fdsets. Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:21 -08:00
Miguel Ojeda Sandonis	33859f7f97	[PATCH] kernel/sched.c: whitespace cleanups [akpm@osdl.org: additional cleanups] Signed-off-by: Miguel Ojeda Sandonis <maxextreme@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:57:20 -08:00
Chen, Kenneth W	62ab616d54	[PATCH] sched: optimize activate_task for RT task RT task does not participate in interactiveness priority and thus shouldn't be bothered with timestamp and p->sleep_type manipulation when task is being put on run queue. Bypass all of the them with a single if (rt_task) test. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Chen, Kenneth W	06066714f6	[PATCH] sched: remove lb_stopbalance counter Remove scheduler stats lb_stopbalance counter. This counter can be calculated by: lb_balanced - lb_nobusyg - lb_nobusyq. There is no need to create gazillion counters while we can derive the value. Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Siddha, Suresh B	783609c6cb	[PATCH] sched: decrease number of load balances Currently at a particular domain, each cpu in the sched group will do a load balance at the frequency of balance_interval. More the cores and threads, more the cpus will be in each sched group at SMP and NUMA domain. And we endup spending quite a bit of time doing load balancing in those domains. Fix this by making only one cpu(first idle cpu or first cpu in the group if all the cpus are busy) in the sched group do the load balance at that particular sched domain and this load will slowly percolate down to the other cpus with in that group(when they do load balancing at lower domains). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Christoph Lameter <clameter@engr.sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Mike Galbraith	b18ec80396	[PATCH] sched: improve migration accuracy Co-opt rq->timestamp_last_tick to maintain a cache_hot_time evaluation reference timestamp at both tick and sched times to prevent said reference, formerly rq->timestamp_last_tick, from being behind task->last_ran at evaluation time, and to move said reference closer to current time on the remote processor, intent being to improve cache hot evaluation and timestamp adjustment accuracy for task migration. Fix minor sched_time double accounting error which occurs when a task passing through schedule() does not schedule off, and takes the next timer tick. [kenneth.w.chen@intel.com: cleanup] Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Ken Chen <kenneth.w.chen@intel.com> Cc: Don Mullis <dwm@meer.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Christoph Lameter	08c183f31b	[PATCH] sched: add option to serialize load balancing Large sched domains can be very expensive to scan. Add an option SD_SERIALIZE to the sched domain flags. If that flag is set then we make sure that no other such domain is being balanced. [akpm@osdl.org: build fix] Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Christoph Lameter	1bd77f2da5	[PATCH] sched: call tasklet less frequently Trigger softirq less frequently We trigger the softirq before this patch using offset of sd->interval. However, if the queue is busy then it is sufficient to schedule the softirq with sd->interval * busy_factor. So we modify the calculation of the next time to balance by taking the interval added to last_balance again. This is only the right value if the idle/busy situation continues as is. There are two potential trouble spots: - If the queue was idle and now gets busy then we call rebalance early. However, that is not a problem because we will then use the longer interval for the next period. - If the queue was busy and becomes idle then we potentially wait too long before rebalancing. However, when the task goes idle then idle_balance is called. We add another calculation of the next balance time based on sd->interval in idle_balance so that we will rebalance soon. V2->V3: - Calculate rebalance time based on current jiffies and not based on the jiffies at the last time we load balanced. We no longer rely on staggering and therefore we can affort to do this now. V3->V4: - Use functions to do jiffy comparisons. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:43 -08:00
Christoph Lameter	c9819f4593	[PATCH] sched: use softirq for load balancing Call rebalance_tick (renamed to run_rebalance_domains) from a newly introduced softirq. We calculate the earliest time for each layer of sched domains to be rescanned (this is the rescan time for idle) and use the earliest of those to schedule the softirq via a new field "next_balance" added to struct rq. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Christoph Lameter	e418e1c2bf	[PATCH] sched: move idle status calculation into rebalance_tick() Perform the idle state determination in rebalance_tick. If we separate balancing from sched_tick then we also need to determine the idle state in rebalance_tick. V2->V3 Remove useless idlle != 0 check. Checking nr_running seems to be sufficient. Thanks Suresh. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Christoph Lameter	7835b98bc6	[PATCH] sched: extract load calculation from rebalance_tick A load calculation is always done in rebalance_tick() in addition to the real load balancing activities that only take place when certain jiffie counts have been reached. Move that processing into a separate function and call it directly from scheduler_tick(). Also extract the time slice handling from scheduler_tick and put it into a separate function. Then we can clean up scheduler_tick significantly. It will no longer have any gotos. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Christoph Lameter	fe2eea3faf	[PATCH] sched: disable interrupts for locking in load_balance() Interrupts must be disabled for request queue locks if we want to run load_balance() with interrupts enabled. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Christoph Lameter	4211a9a2e9	[PATCH] sched: remove staggering of load balancing Timer interrupts already are staggered. We do not need an additional layer of time staggering for short load balancing actions that take a reasonably small portion of the time slice. For load balancing on large sched_domains we will add a serialization later that avoids concurrent load balance operations and thus has the same effect as load staggering. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Christoph Lameter	571f6d2fb0	[PATCH] sched: avoid taking rq lock in wake_priority_sleeper Avoid taking the request queue lock in wake_priority_sleeper if there are no running processes. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Peter Williams <pwil3058@bigpond.net.au> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Kirill Korotaev	054b9108e0	[PATCH] move_task_off_dead_cpu() should be called with disabled ints move_task_off_dead_cpu() requires interrupts to be disabled, while migrate_dead() calls it with enabled interrupts. Added appropriate comments to functions and added BUG_ON(!irqs_disabled()) into double_rq_lock() and double_lock_balance() which are the origin sources of such bugs. Signed-off-by: Kirill Korotaev <dev@openvz.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Siddha, Suresh B	6711cab43e	[PATCH] ched domain: move sched group allocations to percpu area Move the sched group allocations to percpu area. This will minimize cross node memory references and also cleans up the sched groups allocation for allnodes sched domain. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Robert P. J. Day	cc2a73b5ca	[PATCH] sched.c: correct comment for this_rq_lock() Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:42 -08:00
Andrew Morton	4a7864ca63	[PATCH] io-accounting: via taskstats Deliver IO accounting via taskstats. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Andrew Morton	7c3ab7381e	[PATCH] io-accounting: core statistics The present per-task IO accounting isn't very useful. It simply counts the number of bytes passed into read() and write(). So if a process reads 1MB from an already-cached file, it is accused of having performed 1MB of I/O, which is wrong. (David Wright had some comments on the applicability of the present logical IO accounting: For billing purposes it is useless but for workload analysis it is very useful read_bytes/read_calls average read request size write_bytes/write_calls average write request size read_bytes/read_blocks ie logical/physical can indicate hit rate or thrashing write_bytes/write_blocks ie logical/physical guess since pdflush writes can be missed I often look for logical larger than physical to see filesystem cache problems. And the bytes/cpusec can help find applications that are dominating the cache and causing slow interactive response from page cache contention. I want to find the IO intensive applications and make sure they are doing efficient IO. Thus the acctcms(sysV) or csacms command would give the high IO commands). This patchset adds new accounting which tries to be more accurate. We account for three things: reads: attempt to count the number of bytes which this process really did cause to be fetched from the storage layer. Done at the submit_bio() level, so it is accurate for block-backed filesystems. I also attempt to wire up NFS and CIFS. writes: attempt to count the number of bytes which this process caused to be sent to the storage layer. This is done at page-dirtying time. The big inaccuracy here is truncate. If a process writes 1MB to a file and then deletes the file, it will in fact perform no writeout. But it will have been accounted as having caused 1MB of write. So... cancelled_writes: account the number of bytes which this process caused to not happen, by truncating pagecache. We _could_ just subtract this from the process's `write' accounting. But that means that some processes would be reported to have done negative amounts of write IO, which is silly. So we just report the raw number and punt this decision up to userspace. Now, we _could_ account for writes at the physical I/O level. But - This would require that we track memory-dirtying tasks at the per-page level (would require a new pointer in struct page). - It would mean that IO statistics for a process are usually only available long after that process has exitted. Which means that we probably cannot communicate this info via taskstats. This patch: Wire up the kernel-private data structures and the accessor functions to manipulate them. Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Alexey Dobriyan	1f29bcd739	[PATCH] sysctl: remove unused "context" param Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andi Kleen <ak@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: David Howells <dhowells@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Alexey Dobriyan	98d7340c36	[PATCH] sysctl: remove some OPs kernel.cap-bound uses only OP_SET and OP_AND Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:40 -08:00
Randy Dunlap	d53ef07ab4	[PATCH] ipc-procfs-sysctl mixups When CONFIG_PROC_FS=n and CONFIG_PROC_SYSCTL=n but CONFIG_SYSVIPC=y, we get this build error: kernel/built-in.o:(.data+0xc38): undefined reference to `proc_ipc_doulongvec_minmax' kernel/built-in.o:(.data+0xc88): undefined reference to `proc_ipc_doulongvec_minmax' kernel/built-in.o:(.data+0xcd8): undefined reference to `proc_ipc_dointvec' kernel/built-in.o:(.data+0xd28): undefined reference to `proc_ipc_dointvec' kernel/built-in.o:(.data+0xd78): undefined reference to `proc_ipc_dointvec' kernel/built-in.o:(.data+0xdc8): undefined reference to `proc_ipc_dointvec' kernel/built-in.o:(.data+0xe18): undefined reference to `proc_ipc_dointvec' make: *** [vmlinux] Error 1 Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:39 -08:00
David Howells	4594bf159f	[PATCH] WorkStruct: Use direct assignment rather than cmpxchg() Use direct assignment rather than cmpxchg() as the latter is unavailable and unimplementable on some platforms and is actually unnecessary. The use of cmpxchg() was to guard against two possibilities, neither of which can actually occur: (1) The pending flag may have been unset or may be cleared. However, given where it's called, the pending flag is _always_ set. I don't think it can be unset whilst we're in set_wq_data(). Once the work is enqueued to be actually run, the only way off the queue is for it to be actually run. If it's a delayed work item, then the bit can't be cleared by the timer because we haven't started the timer yet. Also, the pending bit can't be cleared by cancelling the delayed work _until_ the work item has had its timer started. (2) The workqueue pointer might change. This can only happen in two cases: (a) The work item has just been queued to actually run, and so we're protected by the appropriate workqueue spinlock. (b) A delayed work item is being queued, and so the timer hasn't been started yet, and so no one else knows about the work item or can access it (the pending bit protects us). Besides, set_wq_data() _sets_ the workqueue pointer unconditionally, so it can be assigned instead. So, replacing the set_wq_data() with a straight assignment would be okay in most cases. The problem is where we end up tangling with test_and_set_bit() emulated using spinlocks, and even then it's not a problem _provided_ test_and_set_bit() doesn't attempt to modify the word if the bit was set. If that's a problem, then a bitops-proofed assignment will be required - equivalent to atomic_set() vs other atomic_xxx() ops. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-09 12:25:08 -08:00
Eric W. Biederman	6b49a25785	[PATCH] sysctl: fix sys_sysctl interface of ipc sysctls Currently there is a regression and the ipc sysctls don't show up in the binary sysctl namespace. This patch adds sysctl_ipc_data to read data/write from the appropriate namespace and deliver it in the expected manner. [akpm@osdl.org: warning fix] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:03 -08:00
Eric W. Biederman	9bc9a6bd3c	[PATCH] sysctl: simplify ipc ns specific sysctls Refactor the ipc sysctl support so that it is simpler, more readable, and prepares for fixing the bug with the wrong values being returned in the sys_sysctl interface. The function proc_do_ipc_string() was misnamed as it never handled strings. It's magic of when to work with strings and when to work with longs belonged in the sysctl table. I couldn't tell if the code would work if you disabled the ipc namespace but it certainly looked like it would have problems. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:03 -08:00
Eric W. Biederman	c4b8b769fa	[PATCH] sysctl: implement sysctl_uts_string() The problem: When using sys_sysctl we don't read the proper values for the variables exported from the uts namespace, nor do we do the proper locking. This patch introduces sysctl_uts_string which properly fetches the values and does the proper locking. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:03 -08:00
Eric W. Biederman	cf9f151c72	[PATCH] sysctl: simplify sysctl_uts_string The binary interface to the namespace sysctls was never implemented resulting in some really weird things if you attempted to use sys_sysctl to read your hostname for example. This patch series simples the code a little and implements the binary sysctl interface. In testing this patch series I discovered that our 32bit compatibility for the binary sysctl interface is imperfect. In particular KERN_SHMMAX and KERN_SMMALL are size_t sized quantities and are returned as 8 bytes on to 32bit binaries using a x86_64 kernel. However this has existing for a long time so it is not a new regression with the namespace work. Gads the whole sysctl thing needs work before it stops being easy to shoot yourself in the foot. Looking forward a little bit we need a better way to handle sysctls and namespaces as our current technique will not work for the network namespace. I think something based on the current overlapping sysctl trees will work but the proc side needs to be redone before we can use it. This patch: Introduce get_uts() and put_uts() (used later) and remove most of the special cases for when UTS namespace is compiled in. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:03 -08:00
Oleg Nesterov	62dfb5541a	[PATCH] session_of_pgrp: kill unnecessary do_each_task_pid(PIDTYPE_PGID) All members of the process group have the same sid and it can't be == 0. NOTE: this code (and a similar one in sys_setpgid) was needed because it was possibe to have ->session == 0. It's not possible any longer since [PATCH] pidhash: don't use zero pids Commit: `c7c6464117` Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Oleg Nesterov	f020bc468f	[PATCH] sys_setpgid: eliminate unnecessary do_each_task_pid(PIDTYPE_PGID) All tasks in the process group have the same sid, we don't need to iterate them all to check that the caller of sys_setpgid() doesn't change its session. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Sukadev Bhattiprolu	84d737866e	[PATCH] add child reaper to pid_namespace Add a per pid_namespace child-reaper. This is needed so processes are reaped within the same pid space and do not spill over to the parent pid space. Its also needed so containers preserve existing semantic that pid == 1 would reap orphaned children. This is based on Eric Biederman's patch: http://lkml.org/lkml/2006/2/6/285 Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Cedric Le Goater	6cc1b22a4a	[PATCH] use current->nsproxy->pid_ns Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Cedric Le Goater	9a575a92db	[PATCH] to nsproxy Add the pid namespace framework to the nsproxy object. The copy of the pid namespace only increases the refcount on the global pid namespace, init_pid_ns, and unshare is not implemented. There is no configuration option to activate or deactivate this feature because this not relevant for the moment. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Sukadev Bhattiprolu	61a58c6c23	[PATCH] rename struct pspace to struct pid_namespace Rename struct pspace to struct pid_namespace for consistency with other namespaces (uts_namespace and ipc_namespace). Also rename include/linux/pspace.h to include/linux/pid_namespace.h and variables from pspace to pid_ns. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Cedric Le Goater	373beb35cd	[PATCH] identifier to nsproxy Add an identifier to nsproxy. The default init_ns_proxy has identifier 0 and allocated nsproxies are given -1. This identifier will be used by a new syscall sys_bind_ns. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:52 -08:00
Kirill Korotaev	6b3286ed11	[PATCH] rename struct namespace to struct mnt_namespace Rename 'struct namespace' to 'struct mnt_namespace' to avoid confusion with other namespaces being developped for the containers : pid, uts, ipc, etc. 'namespace' variables and attributes are also renamed to 'mnt_ns' Signed-off-by: Kirill Korotaev <dev@sw.ru> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:51 -08:00
Cedric Le Goater	1ec320afdc	[PATCH] add process_session() helper routine: deprecate old field Add an anonymous union and ((deprecated)) to catch direct usage of the session field. [akpm@osdl.org: fix various missed conversions] [jdike@addtoit.com: fix UML bug] Signed-off-by: Jeff Dike <jdike@addtoit.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:51 -08:00
Cedric Le Goater	937949d9ed	[PATCH] add process_session() helper routine Replace occurences of task->signal->session by a new process_session() helper routine. It will be useful for pid namespaces to abstract the session pid number. Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:51 -08:00
Josef Sipek	a7a005fd12	[PATCH] struct path: convert kernel Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:46 -08:00
Josef "Jeff" Sipek	f3a43f3f64	[PATCH] kernel: change uses of f_{dentry, vfsmnt} to use f_path Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in linux/kernel/. Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:42 -08:00
NeilBrown	d63a5a74de	[PATCH] lockdep: avoid lockdep warning in md md_open takes ->reconfig_mutex which causes lockdep to complain. This (normally) doesn't have deadlock potential as the possible conflict is with a reconfig_mutex in a different device. I say "normally" because if a loop were created in the array->member hierarchy a deadlock could happen. However that causes bigger problems than a deadlock and should be fixed independently. So we flag the lock in md_open as a nested lock. This requires defining mutex_lock_interruptible_nested. Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:39 -08:00
Oleg Nesterov	dae3c5a0b7	[PATCH] sys_unshare: remove a broken CLONE_SIGHAND code sys_unshare(CLONE_SIGHAND) is broken, the code under 'if (new_sigh)' is never executed but very wrong. Just remove it to avoid a confusion, task_lock() has nothing to do with ->sighand changing. Also, change the comment in unshare_sighand(). Yes, CLONE_THREAD implies CLONE_SIGHAND, but still it looks confusing. Also, we don't need to check current->sighand != NULL. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:38 -08:00
Oleg Nesterov	ae424ae4b5	[PATCH] make set_special_pids() static Make set_special_pids() static, the only caller is daemonize(). Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:38 -08:00
Oleg Nesterov	7bcfa95e56	[PATCH] do_acct_process(): don't take tty_mutex No need to take the global tty_mutex, signal->tty->driver can't go away while we are holding ->siglock. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:38 -08:00
Peter Zijlstra	24ec839c43	[PATCH] tty: ->signal->tty locking Fix the locking of signal->tty. Use ->sighand->siglock to protect ->signal->tty; this lock is already used by most other members of ->signal/->sighand. And unless we are 'current' or the tasklist_lock is held we need ->siglock to access ->signal anyway. (NOTE: sys_unshare() is broken wrt ->sighand locking rules) Note that tty_mutex is held over tty destruction, so while holding tty_mutex any tty pointer remains valid. Otherwise the lifetime of ttys are governed by their open file handles. This leaves some holes for tty access from signal->tty (or any other non file related tty access). It solves the tty SLAB scribbles we were seeing. (NOTE: the change from group_send_sig_info to __group_send_sig_info needs to be examined by someone familiar with the security framework, I think it is safe given the SEND_SIG_PRIV from other __group_send_sig_info invocations) [schwidefsky@de.ibm.com: 3270 fix] [akpm@osdl.org: various post-viro fixes] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Alan Cox <alan@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Roland McGrath <roland@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:38 -08:00
Hidetoshi Seto	6e2ac66470	[PATCH] CPEI gets warning at kernel/irq/migration.c:27/move_masked_irq() While running my MCA test (hardware error injection) on 2.6.19, I got some warning like following: > BUG: warning at kernel/irq/migration.c:27/move_masked_irq() > > Call Trace: > [<a000000100013d20>] show_stack+0x40/0xa0 > sp=e00000006b2578d0 bsp=e00000006b2510b0 > [<a000000100013db0>] dump_stack+0x30/0x60 > sp=e00000006b257aa0 bsp=e00000006b251098 > [<a0000001000de430>] move_masked_irq+0xb0/0x240 > sp=e00000006b257aa0 bsp=e00000006b251070 > [<a0000001000de6a0>] move_native_irq+0xe0/0x180 > sp=e00000006b257aa0 bsp=e00000006b251040 > [<a00000010004ff50>] iosapic_end_level_irq+0x30/0xe0 > sp=e00000006b257aa0 bsp=e00000006b251020 > [<a0000001000d94d0>] __do_IRQ+0x170/0x400 > sp=e00000006b257aa0 bsp=e00000006b250fd8 > [<a0000001000116f0>] ia64_handle_irq+0x1b0/0x260 > sp=e00000006b257aa0 bsp=e00000006b250fa8 > [<a00000010000c3a0>] ia64_leave_kernel+0x0/0x280 > sp=e00000006b257aa0 bsp=e00000006b250fa8 > [<a000000100690cf0>] _spin_unlock_irqrestore+0x30/0x60 > sp=e00000006b257c70 bsp=e00000006b250f90 It comes from: [kernel/irq/migration.c] 26 if (CHECK_IRQ_PER_CPU(desc->status)) { 27 WARN_ON(1); 28 return; 29 } By putting some printk in kernel, I found that irqbalance is trying to move CPEI which is handled as PER_CPU irq. That's why. CPEI(Corrected Platform Error Interrupt) is ia64 specific irq, is allowed to pin to particular processor which selected by the platform, and even it is PER_CPU but it has set_affinity handler (=iosapic_set_affinity) as same as other IO-SAPIC-level interrupts. (I don't know why, but I guess that there would be typical situation where the handler for migration is needed, such as hotplug - the processor going to be offline/hot-removed.) To shut up this warning, there are 2 way at least: a) fix CPEI stuff b) prohibit setting affinity to PER_CPU irq I'm not sure what stuff of CPEI need to be fixed, but I think that returning error to attempting move PER_CPU irq is useful for all applications since it will never work. Following small patch takes b) style. It works, the warning disappeared and irqbalance still runs well. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Arjan van de Ven <arjan@infradead.org> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:37 -08:00
Jan Beulich	aad094701c	[PATCH] move kallsyms data to .rodata Kallsyms data is never written to, so it can as well benefit from CONFIG_DEBUG_RODATA. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:37 -08:00
Linus Torvalds	6ee7e78e7c	Merge branch 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] replace kmalloc+memset with kzalloc [IA64] resolve name clash by renaming is_available_memory() [IA64] Need export for csum_ipv6_magic [IA64] Fix DISCONTIGMEM without VIRTUAL_MEM_MAP [PATCH] Add support for type argument in PAL_GET_PSTATE [IA64] tidy up return value of ip_fast_csum [IA64] implement csum_ipv6_magic for ia64. [IA64] More Itanium PAL spec updates [IA64] Update processor_info features [IA64] Add se bit to Processor State Parameter structure [IA64] Add dp bit to cache and bus check structs [IA64] SN: Correctly update smp_affinty mask [IA64] sparse cleanups [IA64] IA64 Kexec/kdump	2006-12-07 15:39:22 -08:00
Zou Nan hai	a79561134f	[IA64] IA64 Kexec/kdump Changes and updates. 1. Remove fake rendz path and related code according to discuss with Khalid Aziz. 2. fc.i offset fix in relocate_kernel.S. 3. iospic shutdown code eoi and mask race fix from Fujitsu. 4. Warm boot hook in machine_kexec to SN SAL code from Jack Steiner. 5. Send slave to SAL slave loop patch from Jay Lan. 6. Kdump on non-recoverable MCA event patch from Jay Lan 7. Use CTL_UNNUMBERED in kdump_on_init sysctl. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-12-07 09:51:35 -08:00
Linus Torvalds	68380b5813	Add "run_scheduled_work()" workqueue function This allows workqueue users to run just their own pending work, rather than wait for the whole workqueue to finish running. This solves the deadlock with networking libphy that was due to other workqueue entries possibly needing a lock that was held by the routine that wanted to flush its own work. It's not wonderful: if you absolutely need to synchronize with the work function having been executed, any user strictly speaking should have its own completion tracking logic, since when we run things explicitly by hand, the generic workqueue layer can no longer help us synchronize. Also, this is strictly only usable for work that has been scheduled without any delayed timers. You can not mix the new interface with schedule_delayed_work(). But it's better than what we had currently. Acked-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 09:28:19 -08:00
Linus Torvalds	2685b267bc	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (48 commits) [NETFILTER]: Fix non-ANSI func. decl. [TG3]: Identify Serdes devices more clearly. [TG3]: Use msleep. [TG3]: Use netif_msg_*. [TG3]: Allow partial speed advertisement. [TG3]: Add TG3_FLG2_IS_NIC flag. [TG3]: Add 5787F device ID. [TG3]: Fix Phy loopback. [WANROUTER]: Kill kmalloc debugging code. [TCP] inet_twdr_hangman: Delete unnecessary memory barrier(). [NET]: Memory barrier cleanups [IPSEC]: Fix inetpeer leak in ipv4 xfrm dst entries. audit: disable ipsec auditing when CONFIG_AUDITSYSCALL=n audit: Add auditing to ipsec [IRDA] irlan: Fix compile warning when CONFIG_PROC_FS=n [IrDA]: Incorrect TTP header reservation [IrDA]: PXA FIR code device model conversion [GENETLINK]: Fix misplaced command flags. [NETLIK]: Add a pointer to the Generic Netlink wiki page. [IPV6] RAW: Don't release unlocked sock. ...	2006-12-07 09:05:15 -08:00
Linus Torvalds	4522d58275	Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6 * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (156 commits) [PATCH] x86-64: Export smp_call_function_single [PATCH] i386: Clean up smp_tune_scheduling() [PATCH] unwinder: move .eh_frame to RODATA [PATCH] unwinder: fully support linker generated .eh_frame_hdr section [PATCH] x86-64: don't use set_irq_regs() [PATCH] x86-64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq [PATCH] x86-64: Make ix86 default to HIGHMEM4G instead of NOHIGHMEM [PATCH] i386: replace kmalloc+memset with kzalloc [PATCH] x86-64: remove remaining pc98 code [PATCH] x86-64: remove unused variable [PATCH] x86-64: Fix constraints in atomic_add_return() [PATCH] x86-64: fix asm constraints in i386 atomic_add_return [PATCH] x86-64: Correct documentation for bzImage protocol v2.05 [PATCH] x86-64: replace kmalloc+memset with kzalloc in MTRR code [PATCH] x86-64: Fix numaq build error [PATCH] x86-64: include/asm-x86_64/cpufeature.h isn't a userspace header [PATCH] unwinder: Add debugging output to the Dwarf2 unwinder [PATCH] x86-64: Clarify error message in GART code [PATCH] x86-64: Fix interrupt race in idle callback (3rd try) [PATCH] x86-64: Remove unwind stack pointer alignment forcing again ... Fixed conflict in include/linux/uaccess.h manually Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:59:11 -08:00
Paul Menage	d3ed11c356	[PATCH] cpuset: allow a larger buffer for writes to cpuset files When using fake NUMA setup, the number of memory nodes can greatly exceed the number of CPUs. So the current limit in cpuset_common_file_write() is insufficient. Signed-off-by: Paul Menage <menage@google.com> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:48 -08:00
Ingo Molnar	7929082250	[PATCH] add ignore_loglevel boot option Sometimes the kernel prints something interesting while userspace bootup keeps messages turned off via loglevel. Enable the printing of /all/ kernel messages via the "ignore_loglevel" boot option. Off by default. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:47 -08:00
Ingo Molnar	70e4506765	[PATCH] lockdep: register_lock_class() fix The hash_lock must only ever be taken with irqs disabled. This happens in all the important places, except one codepath: register_lock_class(). The race should trigger rarely because register_lock_class() is quite rare and single-threaded (happens during init most of the time). The fix is to disable irqs. ( bug found live in -rt: there preemption is alot more agressive and preempting with the hash-lock held caused a lockup.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:46 -08:00
Magnus Damm	85916f8166	[PATCH] Kexec / Kdump: Unify elf note code The elf note saving code is currently duplicated over several architectures. This cleanup patch simply adds code to a common file and then replaces the arch-specific code with calls to the newly added code. The only drawback with this approach is that s390 doesn't fully support kexec-on-panic which for that arch leads to introduction of unused code. Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:46 -08:00
Helge Deller	15ad7cdcfd	[PATCH] struct seq_operations and struct file_operations constification - move some file_operations structs into the .rodata section - move static strings from policy_types[] array into the .rodata section - fix generic seq_operations usages, so that those structs may be defined as "const" as well [akpm@osdl.org: couple of fixes] Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:46 -08:00
Ralf Baechle	ccdea2f88b	[PATCH] futex: remove unneeded barrier When disassembling a kernel I found around over 90 sync Instructions from mb, rmb and wmb calls in the kernel and only few of those make any sense to me. So here's the first one - I think the wmb() in kernel/futex.c is not needed on uniprocessors so should become an smb_wmb(). Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:45 -08:00
Josh Triplett	012d3ca8d8	[PATCH] Add Sparse annotations to SRCU wrapper functions in rcutorture The SRCU wrapper functions srcu_torture_read_lock and srcu_torture_read_unlock in rcutorture intentionally change the SRCU context; annotate them accordingly, to avoid a warning. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:44 -08:00
David Rientjes	a09c17a6fd	[PATCH] sys: remove unused variable Remove unused 'new_ruid' variable. Reported by David Binderman <dcb314@hotmail.com>. Signed-off-by: David Rientjes <rientjes@cs.washington.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:44 -08:00
Adrian Bunk	045f147f32	[PATCH] remove EXPORT_UNUSED_SYMBOL'ed symbols In time for 2.6.20, we can get rid of this junk. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:44 -08:00
Zachary Amsden	3908fd2ed9	[PATCH] softirq: remove BUG_ONs which can incorrectly trigger It is possible to have tasklets get scheduled before softirqd has had a chance to spawn on all CPUs. This is totally harmless; after success during action CPU_UP_PREPARE, action CPU_ONLINE will be called, which immediately wakes softirqd on the appropriate CPU to process the already pending tasklets. So there is no danger of having a missed wakeup for any tasklets that were already pending. In particular, i386 is affected by this during startup, and is visible when using a very large initrd; during the time it takes for the initrd to be decompressed, a timer IRQ can come in and schedule RCU callbacks. It is also possible that resending of a hardware IRQ via a softirq triggers the same bug. Because of different timing conditions, this shows up in all emulators and virtual machines tested, including Xen, VMware, Virtual PC, and Qemu. It is also possible to trigger on native hardware with a large enough initrd, although I don't have a reliable case demonstrating that. Signed-off-by: Zachary Amsden <zach@vmware.com> Cc: <caglar@pardus.org.tr> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:43 -08:00
Ingo Molnar	2ee91f197c	[PATCH] lockdep: show more details about self-test failures Make the locking self-test failures (of 'FAILURE' type) easier to debug by printing more information. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:43 -08:00
Ingo Molnar	50cc670aeb	[PATCH] lockdep: more chains Some have reported a chain-table overflow - double its size. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:43 -08:00
Chris Caputo	301827acbe	[PATCH] sched: correct output of show_state() At present show_state prints a header the does not match the output of show_task, as follows: - sibling task PC pid father child younger older init S 00000000 0 1 0 2 (NOTLB) - This patch corrects the output of show_state so that the header is aligned with the data, ala: - free sibling task PC stack pid father child younger older init S 00000000 0 1 0 2 (NOTLB) - Signed-off-by: Chris Caputo <ccaputo@alt.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:42 -08:00
BP, Praveen	bd9b0bac6f	[PATCH] sysctl: string length calculated is wrong if it contains negative numbers In the functions do_proc_dointvec() and do_proc_doulongvec_minmax(), there seems to be a bug in string length calculation if string contains negative integer. The console log given below explains the bug. Setting negative values may not be a right thing to do for "console log level" but then the test (given below) can be used to demonstrate the bug in the code. # echo "-1 -1 -1 -123456" > /proc/sys/kernel/printk # cat /proc/sys/kernel/printk -1 -1 -1 -1234 # # echo "-1 -1 -1 123456" > /proc/sys/kernel/printk # cat /proc/sys/kernel/printk -1 -1 -1 1234 # (akpm: the bug is that 123456 gets truncated) It works as expected if string contains all +ve integers # echo "1 2 3 4" > /proc/sys/kernel/printk # cat /proc/sys/kernel/printk 1 2 3 4 # The patch given below fixes the issue. Signed-off-by: Praveen BP <praveenbp@ti.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:42 -08:00
Akinobu Mita	95362fa903	[PATCH] futex: init error check Check register_filesystem() and kern_mount() return values. Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:41 -08:00
Burman Yan	4668edc334	[PATCH] kernel core: replace kmalloc+memset with kzalloc Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:41 -08:00
Eric Dumazet	1c69d921ed	[PATCH] rcu: add a prefetch() in rcu_do_batch() On some workloads, (for example when lot of close() syscalls are done), RCU qlen can be quite large, and RCU heads are no longer in cpu cache when rcu_do_batch() is called. This patch adds a prefetch() in rcu_do_batch() to give CPU a hint to bring back cache lines containing 'struct rcu_head's. Most list manipulations macros include prefetch(), but not open coded ones (at least with current C compilers :) ) I got a nice speedup on a trivial benchmark (3.48 us per iteration instead of 3.95 us on a 1.6 GHz Pentium-M) while (1) { pipe(p); close(fd[0]); close(fd[1]);} Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:40 -08:00

... 2 3 4 5 6 ...

1996 Commits