- constify and optimize stat_nam (thanks to Michael Tokarev!)
- spelling and comment fixes
Signed-off-by: Andreas Mohr <andi@lisas.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Problem:
In the function __migrate_task(), deactivate_task() followed by
activate_task() is used to move the task from one run queue to
another. This has two undesirable effects:
1. The task's priority is recalculated. (Nowhere else in the
scheduler code is the priority recalculated for a change of CPU.)
2. The task's time stamp is set to the current time. At the very least,
this makes the adjustment of the time stamp before the call to
deactivate_task() redundant but I believe the problem is more serious
as the time stamp now holds the time of the queue change instead of
the time at which the task was woken. In addition, unless dest_rq is
the same queue as "current" is on the time stamp could be inaccurate
due to inter CPU drift.
Solution:
Replace the call to activate_task() with one to __activate_task().
Signed-off-by: Peter Williams <pwil3058@bigpond.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
Move workqueue exports to where the functions are defined.
[CPUFREQ] Misc cleanups in ondemand.
[CPUFREQ] Make ondemand sampling per CPU and remove the mutex usage in sampling path.
[CPUFREQ] Add queue_delayed_work_on() interface for workqueues.
[CPUFREQ] Remove slowdown from ondemand sampling path.
Jiri reports that the stop_machin kthread conversion caused his machine to
hang when suspending. Hyperthreading is apparently involved.
I don't see why that would be and I can't reproduce it. Revert to the 2.6.17
code.
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: add defconfig for Freescale MPC8349E-mITX board
powerpc: Add base support for the Freescale MPC8349E-mITX eval board
Documentation: correct values in MPC8548E SEC example node
[POWERPC] Actually copy over i8259.c to arch/ppc/syslib this time
[POWERPC] Add new interrupt mapping core and change platforms to use it
[POWERPC] Copy i8259 code back to arch/ppc
[POWERPC] New device-tree interrupt parsing code
[POWERPC] Use the genirq framework
[PATCH] genirq: Allow fasteoi handler to retrigger disabled interrupts
[POWERPC] Update the SWIM3 (powermac) floppy driver
[POWERPC] Fix error handling in detecting legacy serial ports
[POWERPC] Fix booting on Momentum "Apache" board (a Maple derivative)
[POWERPC] Fix various offb and BootX-related issues
[POWERPC] Add a default config for 32-bit CHRP machines
[POWERPC] fix implicit declaration on cell.
[POWERPC] change get_property to return void *
convert:
- runqueue_t to 'struct rq'
- prio_array_t to 'struct prio_array'
- migration_req_t to 'struct migration_req'
I was the one who added these but they are both against the kernel coding
style and also were used inconsistently at places. So just get rid of them at
once, now that we are flushing the scheduler patch-queue anyway.
Conversion was mostly scripted, the result was reviewed and all secondary
whitespace and style impact (if any) was fixed up by hand.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cleanup: remove task_t and convert all the uses to struct task_struct. I
introduced it for the scheduler anno and it was a mistake.
Conversion was mostly scripted, the result was reviewed and all
secondary whitespace and style impact (if any) was fixed up by hand.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up some of the impact of recent (and not so recent) scheduler
changes:
- turning macros into nice inline functions
- sanitizing and unifying variable definitions
- whitespace, style consistency, 80-lines, comment correctness, spelling
and curly braces police
Due to the macro hell and variable placement simplifications there's even 26
bytes of .text saved:
text data bss dec hex filename
25510 4153 192 29855 749f sched.o.before
25484 4153 192 29829 7485 sched.o.after
[akpm@osdl.org: build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
At the moment, powerpc and s390 have their own versions of do_softirq which
include local_bh_disable() and __local_bh_enable() calls. They end up
calling __do_softirq (in kernel/softirq.c) which also does
local_bh_disable/enable.
Apparently the two levels of disable/enable trigger a warning from some
validation code that Ingo is working on, and he would like to see the outer
level removed. But to do that, we have to move the account_system_vtime
calls that are currently in the arch do_softirq() implementations for
powerpc and s390 into the generic __do_softirq() (this is a no-op for other
archs because account_system_vtime is defined to be an empty inline
function on all other archs). This patch does that.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
lockdep needs to have the waitqueue lock initialized for on-stack waitqueues
implicitly initialized by DECLARE_COMPLETION(). Annotate on-stack completions
accordingly.
Has no effect on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make use of local_irq_enable_in_hardirq() API to annotate places that enable
hardirqs in hardirq context.
Has no effect on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Teach special (recursive) locking code to the lock validator. Has no effect
on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Teach special (recursive) locking code to the lock validator. Has no effect
on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Teach per-CPU runqueue locks and recursive locking code to the lock validator.
Has no effect on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Split the per-CPU timer base locks up into separate lock classes, because they
are used recursively.
Has no effect on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Create one lock class for all waitqueue locks in the kernel. Has no effect on
non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Teach special (recursive) locking code to the lock validator. Has no effect
on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Teach special (recursive) locking code to the lock validator. Introduces
double_lock_hb() to unify double- hash-bucket-lock taking.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make printk()-ing from within the lock validation code safer by using the
lockdep-recursion counter.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the lock validator framework to prove mutex locking correctness.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the lock validator framework to prove spinlock and rwlock locking
correctness.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Use the lock validator framework to prove rwsem locking correctness.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Lock validator /proc/lockdep and /proc/lockdep_stats support.
(FIXME: should go into debugfs)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Ingo Molnar <mingo@elte.hu>
lockdep so far only allowed read-recursion for the same lock instance.
This is enough in the overwhelming majority of cases, but a hostap case
triggered and reported by Miles Lane relies on same-class
different-instance recursion. So we relax the restriction on read-lock
recursion.
(This change does not allow rwsem read-recursion, which is still
forbidden.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Do 'make oldconfig' and accept all the defaults for new config options -
reboot into the kernel and if everything goes well it should boot up fine and
you should have /proc/lockdep and /proc/lockdep_stats files.
Typically if the lock validator finds some problem it will print out
voluminous debug output that begins with "BUG: ..." and which syslog output
can be used by kernel developers to figure out the precise locking scenario.
What does the lock validator do? It "observes" and maps all locking rules as
they occur dynamically (as triggered by the kernel's natural use of spinlocks,
rwlocks, mutexes and rwsems). Whenever the lock validator subsystem detects a
new locking scenario, it validates this new rule against the existing set of
rules. If this new rule is consistent with the existing set of rules then the
new rule is added transparently and the kernel continues as normal. If the
new rule could create a deadlock scenario then this condition is printed out.
When determining validity of locking, all possible "deadlock scenarios" are
considered: assuming arbitrary number of CPUs, arbitrary irq context and task
context constellations, running arbitrary combinations of all the existing
locking scenarios. In a typical system this means millions of separate
scenarios. This is why we call it a "locking correctness" validator - for all
rules that are observed the lock validator proves it with mathematical
certainty that a deadlock could not occur (assuming that the lock validator
implementation itself is correct and its internal data structures are not
corrupted by some other kernel subsystem). [see more details and conditionals
of this statement in include/linux/lockdep.h and
Documentation/lockdep-design.txt]
Furthermore, this "all possible scenarios" property of the validator also
enables the finding of complex, highly unlikely multi-CPU multi-context races
via single single-context rules, increasing the likelyhood of finding bugs
drastically. In practical terms: the lock validator already found a bug in
the upstream kernel that could only occur on systems with 3 or more CPUs, and
which needed 3 very unlikely code sequences to occur at once on the 3 CPUs.
That bug was found and reported on a single-CPU system (!). So in essence a
race will be found "piecemail-wise", triggering all the necessary components
for the race, without having to reproduce the race scenario itself! In its
short existence the lock validator found and reported many bugs before they
actually caused a real deadlock.
To further increase the efficiency of the validator, the mapping is not per
"lock instance", but per "lock-class". For example, all struct inode objects
in the kernel have inode->inotify_mutex. If there are 10,000 inodes cached,
then there are 10,000 lock objects. But ->inotify_mutex is a single "lock
type", and all locking activities that occur against ->inotify_mutex are
"unified" into this single lock-class. The advantage of the lock-class
approach is that all historical ->inotify_mutex uses are mapped into a single
(and as narrow as possible) set of locking rules - regardless of how many
different tasks or inode structures it took to build this set of rules. The
set of rules persist during the lifetime of the kernel.
To see the rough magnitude of checking that the lock validator does, here's a
portion of /proc/lockdep_stats, fresh after bootup:
lock-classes: 694 [max: 2048]
direct dependencies: 1598 [max: 8192]
indirect dependencies: 17896
all direct dependencies: 16206
dependency chains: 1910 [max: 8192]
in-hardirq chains: 17
in-softirq chains: 105
in-process chains: 1065
stack-trace entries: 38761 [max: 131072]
combined max dependencies: 2033928
hardirq-safe locks: 24
hardirq-unsafe locks: 176
softirq-safe locks: 53
softirq-unsafe locks: 137
irq-safe locks: 59
irq-unsafe locks: 176
The lock validator has observed 1598 actual single-thread locking patterns,
and has validated all possible 2033928 distinct locking scenarios.
More details about the design of the lock validator can be found in
Documentation/lockdep-design.txt, which can also found at:
http://redhat.com/~mingo/lockdep-patches/lockdep-design.txt
[bunk@stusta.de: cleanups]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Accurate hard-IRQ-flags and softirq-flags state tracing.
This allows us to attach extra functionality to IRQ flags on/off
events (such as trace-on/off).
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Framework to generate and save stacktraces quickly, without printing anything
to the console.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Locking init improvement:
- introduce and use __SPIN_LOCK_UNLOCKED for array initializations,
to pass in the name string of locks, used by debugging
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Work around weird section nesting build bug causing smp-alternatives failures
under certain circumstances.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Generic lock debugging:
- generalized lock debugging framework. For example, a bug in one lock
subsystem turns off debugging in all lock subsystems.
- got rid of the caller address passing (__IP__/__IP_DECL__/etc.) from
the mutex/rtmutex debugging code: it caused way too much prototype
hackery, and lockdep will give the same information anyway.
- ability to do silent tests
- check lock freeing in vfree too.
- more finegrained debugging options, to allow distributions to
turn off more expensive debugging features.
There's no separate 'held mutexes' list anymore - but there's a 'held locks'
stack within lockdep, which unifies deadlock detection across all lock
classes. (this is independent of the lockdep validation stuff - lockdep first
checks whether we are holding a lock already)
Here are the current debugging options:
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
which do:
config DEBUG_MUTEXES
bool "Mutex debugging, basic checks"
config DEBUG_LOCK_ALLOC
bool "Detect incorrect freeing of live mutexes"
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With the lock validator we detect mutex deadlocks (and more), the mutex
deadlock checking code is both redundant and slower. So remove it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rename DEBUG_WARN_ON() to the less generic DEBUG_LOCKS_WARN_ON() name, so that
it's clear that this is a lock-debugging internal mechanism.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add is_module_address() method - to be used by lockdep.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It turns out that it is advantageous to leave a small portion of unmapped file
backed pages if all of a zone's pages (or almost all pages) are allocated and
so the page allocator has to go off-node.
This allows recently used file I/O buffers to stay on the node and
reduces the times that zone reclaim is invoked if file I/O occurs
when we run out of memory in a zone.
The problem is that zone reclaim runs too frequently when the page cache is
used for file I/O (read write and therefore unmapped pages!) alone and we have
almost all pages of the zone allocated. Zone reclaim may remove 32 unmapped
pages. File I/O will use these pages for the next read/write requests and the
unmapped pages increase. After the zone has filled up again zone reclaim will
remove it again after only 32 pages. This cycle is too inefficient and there
are potentially too many zone reclaim cycles.
With the 1% boundary we may still remove all unmapped pages for file I/O in
zone reclaim pass. However. it will take a large number of read and writes
to get back to 1% again where we trigger zone reclaim again.
The zone reclaim 2.6.16/17 does not show this behavior because we have a 30
second timeout.
[akpm@osdl.org: rename the /proc file and the variable]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make the fasteoi handler mark disabled interrupts as pending if they
happen anyway. This allow implementation of a delayed disable scheme
with the fasteoi handler.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The irqflags consolidation converted SA_PERCPU_IRQ to IRQF_PERCPU but
did not define the new constant.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus: "The hacks in kernel/irq/handle.c are really horrid. REALLY
horrid."
They are indeed. Move the dyntick quirks to ARM where they belong.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Thomas Gleixner
From: Thomas Gleixner <tglx@linutronix.de>
ARM has a couple of really dumb interrupt controllers.
Implement a generic one and fixup the ARM migration. ARM reused
the no_irq_chip for this purpose, but this does not work out
for platforms which are not converted to the new interrupt
type handling model.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Thomas Gleixner
From: Thomas Gleixner <tglx@linutronix.de>
Make the ARM dyntick implementation work with the generic
irq code. This hopefully goes away once we consolidated the
dyntick implementations.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
IRQ_PER_CPU is a bit in the struct irq_desc "status" field, not in the
struct irqaction "flags", so the previous code checked the wrong bit.
SA_PERCPU_IRQ is only used by drivers/char/mmtimer.c for SGI ia64 boxes.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix futex_wake() exit condition bug when handling the robust-list with PI
futexes on them.
(reported by Ulrich Drepper, debugged by the lock validator.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
lock_queue was getting called essentially twice in a row and was
continually incrementing the mm_count ref count, thus causing a memory
leak.
Dinakar Guniguntala provided a proper fix for the problem that simply grabs
the spinlock for the hash bucket queue rather than calling lock_queue.
The second time we do a queue_lock in futex_lock_pi, we really only need to
take the hash bucket lock.
Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
Signed-off-by: Vernon Mauery <vernux@us.ibm.com>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Allow to tie upper bits of syscall bitmap in audit rules to kernel-defined
sets of syscalls. Infrastructure, a couple of classes (with 32bit counterparts
for biarch targets) and actual tie-in on i386, amd64 and ia64.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch introduces object audit filters based on the elements
of the SELinux context.
Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
kernel/auditfilter.c | 25 +++++++++++++++++++++++++
kernel/auditsc.c | 40 ++++++++++++++++++++++++++++++++++++++++
security/selinux/ss/services.c | 18 +++++++++++++++++-
3 files changed, 82 insertions(+), 1 deletion(-)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch renames some audit constant definitions and adds
additional definitions used by the following patch. The renaming
avoids ambiguity with respect to the new definitions.
Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com>
include/linux/audit.h | 15 ++++++++----
kernel/auditfilter.c | 50 ++++++++++++++++++++---------------------
kernel/auditsc.c | 10 ++++----
security/selinux/ss/services.c | 32 +++++++++++++-------------
4 files changed, 56 insertions(+), 51 deletions(-)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add support for a rule key, which can be used to tie audit records to audit
rules. This is useful when a watched file is accessed through a link or
symlink, as well as for general audit log analysis.
Because this patch uses a string key instead of an integer key, there is a bit
of extra overhead to do the kstrdup() when a rule fires. However, we're also
allocating memory for the audit record buffer, so it's probably not that
significant. I went ahead with a string key because it seems more
user-friendly.
Note that the user must ensure that filterkeys are unique. The kernel only
checks for duplicate rules.
Signed-off-by: Amy Griffis <amy.griffis@hpd.com>
Fix a bug identified by Zou Nan hai <nanhai.zou@intel.com>:
If the system is in state SYSTEM_BOOTING, and need_resched() is true,
cond_resched() returns true even though it didn't reschedule. Consequently
need_resched() remains true and JBD locks up.
Fix that by teaching cond_resched() to only return true if it really did call
schedule().
cond_resched_lock() and cond_resched_softirq() have a problem too. If we're
in SYSTEM_BOOTING state and need_resched() is true, these functions will drop
the lock and will then try to call schedule(), but the SYSTEM_BOOTING state
will prevent schedule() from being called. So on return, need_resched() will
still be true, but cond_resched_lock() has to return 1 to tell the caller that
the lock was dropped. The caller will probably lock up.
Bottom line: if these functions dropped the lock, they _must_ call schedule()
to clear need_resched(). Make it so.
Also, uninline __cond_resched(). It's largeish, and slowpath.
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a call to the extended security_task_kill hook introduced by
the prior patch to the kill_proc_info_as_uid function so that these signals
can be properly mediated by security modules. It also updates the existing
hook call in check_kill_permission.
Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The zone_reclaim_interval was necessary because we were not able to determine
how many unmapped pages exist in a zone. Therefore we had to scan in
intervals to figure out if any pages were unmapped.
With the zoned counters and NR_ANON_PAGES we now know the number of pagecache
pages and the number of mapped pages in a zone. So we can simply skip the
reclaim if there is an insufficient number of unmapped pages. We use
SWAP_CLUSTER_MAX as the boundary.
Drop all support for /proc/sys/vm/zone_reclaim_interval.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove SWSUSP_ENCRYPT config option; it is no longer implemented.
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This patch encapsulates the usage of eff_cap (in netlink_skb_params) within
the security framework by extending security_netlink_recv to include a required
capability parameter and converting all direct usage of eff_caps outside
of the lsm modules to use the interface. It also updates the SELinux
implementation of the security_netlink_send and security_netlink_recv
hooks to take advantage of the sid in the netlink_skb_params struct.
This also enables SELinux to perform auditing of netlink capability checks.
Please apply, for 2.6.18 if possible.
Signed-off-by: Darrel Goeddel <dgoeddel@trustedcs.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (43 commits)
[POWERPC] Use little-endian bit from firmware ibm,pa-features property
[POWERPC] Make sure smp_processor_id works very early in boot
[POWERPC] U4 DART improvements
[POWERPC] todc: add support for Time-Of-Day-Clock
[POWERPC] Make lparcfg.c work when both iseries and pseries are selected
[POWERPC] Fix idr locking in init_new_context
[POWERPC] mpc7448hpc2 (taiga) board config file
[POWERPC] Add tsi108 pci and platform device data register function
[POWERPC] Add general support for mpc7448hpc2 (Taiga) platform
[POWERPC] Correct the MAX_CONTEXT definition
powerpc: minor cleanups for mpc86xx
[POWERPC] Make sure we select CONFIG_NEW_LEDS if ADB_PMU_LED is set
[POWERPC] Simplify the code defining the 64-bit CPU features
[POWERPC] powerpc: kconfig warning fix
[POWERPC] Consolidate some of kernel/misc*.S
[POWERPC] Remove unused function call_with_mmu_off
[POWERPC] update asm-powerpc/time.h
[POWERPC] Clean up it_lp_queue.h
[POWERPC] Skip the "copy down" of the kernel if it is already at zero.
[POWERPC] Add the use of the firmware soft-reset-nmi to kdump.
...
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
[PATCH] i386: export memory more than 4G through /proc/iomem
[PATCH] 64bit Resource: finally enable 64bit resource sizes
[PATCH] 64bit Resource: convert a few remaining drivers to use resource_size_t where needed
[PATCH] 64bit resource: change pnp core to use resource_size_t
[PATCH] 64bit resource: change pci core and arch code to use resource_size_t
[PATCH] 64bit resource: change resource core to use resource_size_t
[PATCH] 64bit resource: introduce resource_size_t for the start and end of struct resource
[PATCH] 64bit resource: fix up printks for resources in misc drivers
[PATCH] 64bit resource: fix up printks for resources in arch and core code
[PATCH] 64bit resource: fix up printks for resources in pcmcia drivers
[PATCH] 64bit resource: fix up printks for resources in video drivers
[PATCH] 64bit resource: fix up printks for resources in ide drivers
[PATCH] 64bit resource: fix up printks for resources in mtd drivers
[PATCH] 64bit resource: fix up printks for resources in pci core and hotplug drivers
[PATCH] 64bit resource: fix up printks for resources in networks drivers
[PATCH] 64bit resource: fix up printks for resources in sound drivers
[PATCH] 64bit resource: C99 changes for struct resource declarations
Fixed up trivial conflict in drivers/ide/pci/cmd64x.c (the printk that
was changed by the 64-bit resources had been deleted in the meantime ;)
Clean up the fastack concept by turning it into fasteoi and introducing the
->eoi() method for chips.
This also allows the cleanup of an i386 EOI quirk - now the quirk is
cleanly separated from the pure ACK implementation.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Note when a disable interrupt happened with the fasteoi handler as well so
that delayed disable can be implemented with fasteoi-type controllers.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One frequent sign of IRQ handling bugs is the appearance of unexpected
vectors. Print out all the IRQ state in that case. We dont want this patch
upstream, but it is useful during initial testing.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable drivers to request an IRQ with a given irq-flow (trigger/polarity)
setting.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable platforms to set the irq-wake (power-management) properties of an IRQ.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Handle bad IRQ vectors via the irqchip mechanism.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable platforms to use the irq-chip and irq-flow abstractions: allow setting
of the chip, the type and provide highlevel handlers for common irq-flows.
[rostedt@goodmis.org: misroute-irq: Don't call desc->chip->end because of edge interrupts]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable platforms to disable the automatic enabling of freshly set up irqs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable platforms to disable request_irq() for certain interrupts.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Enable platforms that do not have a hardware-assisted hardirq-resend mechanism
to resend them via a softirq-driven IRQ emulation mechanism.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up no_irq_type: share the NOP functions where possible, and properly
name the ack_bad() function.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)
NOTE: ia64 needs testing. i386 and x86_64 tested.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Consolidation: remove the pending_irq_cpumask[NR_IRQS] array and move it into
the irq_desc[NR_IRQS].pending_mask field.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Consolidation: remove the irq_dir[NR_IRQS] and the smp_affinity_entry[NR_IRQS]
arrays and move them into the irq_desc[] array.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cleanup: remove irq_desc_t use from the generic IRQ code, and mark it
obsolete.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that i386 defaults to regparm, explicit uses of fastcall are not needed
anymore.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cleanup: remove irq_descp() - explicit use of irq_desc[] is shorter and more
readable.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Consolidation: remove the irq_affinity[NR_IRQS] array and move it into the
irq_desc[NR_IRQS].affinity field.
[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Convert the irq auto-probing semaphore to a mutex. (This allows us to find
probing API usage bugs sooner, via the mutex debugging code.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch-queue improves the generic IRQ layer to be truly generic, by adding
various abstractions and features to it, without impacting existing
functionality.
While the queue can be best described as "fix and improve everything in the
generic IRQ layer that we could think of", and thus it consists of many
smaller features and lots of cleanups, the one feature that stands out most is
the new 'irq chip' abstraction.
The irq-chip abstraction is about describing and coding and IRQ controller
driver by mapping its raw hardware capabilities [and quirks, if needed] in a
straightforward way, without having to think about "IRQ flow"
(level/edge/etc.) type of details.
This stands in contrast with the current 'irq-type' model of genirq
architectures, which 'mixes' raw hardware capabilities with 'flow' details.
The patchset supports both types of irq controller designs at once, and
converts i386 and x86_64 to the new irq-chip design.
As a bonus side-effect of the irq-chip approach, chained interrupt controllers
(master/slave PIC constructs, etc.) are now supported by design as well.
The end result of this patchset intends to be simpler architecture-level code
and more consolidation between architectures.
We reused many bits of code and many concepts from Russell King's ARM IRQ
layer, the merging of which was one of the motivations for this patchset.
This patch:
rename desc->handler to desc->chip.
Originally i did not want to do this, because it's a big patch. But having
both "desc->handler", "desc->handle_irq" and "action->handler" caused a
large degree of confusion and made the code appear alot less clean than it
truly is.
I have also attempted a dual approach as well by introducing a
desc->chip alias - but that just wasnt robust enough and broke
frequently.
So lets get over with this quickly. The conversion was done automatically
via scripts and converts all the code in the kernel.
This renaming patch is the first one amongst the patches, so that the
remaining patches can stay flexible and can be merged and split up
without having some big monolithic patch act as a merge barrier.
[akpm@osdl.org: build fix]
[akpm@osdl.org: another build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Temporarily add EXPORT_UNUSED_SYMBOL and EXPORT_UNUSED_SYMBOL_GPL. These
will be used as a transition measure for symbols that aren't used in the
kernel and are on the way out. When a module uses such a symbol, a warning
is printk'd at modprobe time.
The main reason for removing unused exports is size: eacho export takes
roughly between 100 and 150 bytes of kernel space in the binary. This
patch gives users the option to immediately get this size gain via a config
option.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With this patch, kdump uses the firmware soft-reset NMI for two purposes:
1) Initiate the kdump (take a crash dump) by issuing a soft-reset.
2) Break a CPU out of a deadlock condition that is detected during kdump
processing.
When a soft-reset is initiated each CPU will enter
system_reset_exception() and set its corresponding bit in the global
bit-array cpus_in_sr then call die(). When die() finds the CPU's bit set
in cpu_in_sr crash_kexec() is called to initiate a crash dump. The first
CPU to enter crash_kexec() is called the "crashing CPU". All other CPUs
are "secondary CPUs". The secondary CPU's pass through to
crash_kexec_secondary() and sleep. The crashing CPU waits for all CPUs
to enter via soft-reset then boots the kdump kernel (see
crash_soft_reset_check())
When the system crashes due to a panic or exception, crash_kexec() is
called by panic() or die(). The crashing CPU sends an IPI to all other
CPUs to notify them of the pending shutdown. If a CPU is in a deadlock
or hung state with interrupts disabled, the IPI will not be delivered.
The result being, that the kdump kernel is not booted. This problem is
solved with the use of a firmware generated soft-reset. After the
crashing_cpu has issued the IPI, it waits for 10 sec for all CPUs to
enter crash_ipi_callback(). A CPU signifies its entry to
crash_ipi_callback() by setting its corresponding bit in the
cpus_in_crash bit array. After 10 sec, if one or more CPUs have not set
their bit in cpus_in_crash we assume that the CPU(s) is deadlocked. The
operator is then prompted to generate a soft-reset to break the
deadlock. Each CPU enters the soft reset handler as described above.
Two conditions must be handled at this point:
1) The system crashed because the operator generated a soft-reset. See
2) The system had crashed before the soft-reset was generated ( in the
case of a Panic or oops).
The first CPU to enter crash_kexec() uses the state of the kexec_lock to
determine this state. If kexec_lock is already held then condition 2 is
true and crash_kexec_secondary() is called, else; this CPU is flagged as
the crashing CPU, the kexec_lock is acquired and crash_kexec() proceeds
as described above.
Each additional CPUs responding to the soft-reset will pass through
crash_kexec() to kexec_secondary(). All secondary CPUs call
crash_ipi_callback() readying them self's for the shutdown. When ready
they clear their bit in cpus_in_sr. The crashing CPU waits in
kexec_secondary() until all other CPUs have cleared their bits in
cpus_in_sr. The kexec kernel boot is then started.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In futex_requeue(), when the 2 futexes keys hash to the same bucket, there
is no need to move the futex_q to the end of the bucket list.
Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When the priority of a task, which is blocked on a lock, changes we must
propagate this change into the PI lock chain. Therefor the chain walk code
is changed to get rid of the references to current to avoid false positives
in the deadlock detector, as setscheduler might be called by a task which
holds the lock on which the task whose priority is changed is blocked.
Also add some comments about the get/put_task_struct usage to avoid
confusion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make test suite setscheduler calls asynchronously. Remove the waits in the
test cases and add a new testcase to verify the correctness of the
setscheduler priority propagation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is no need to hold tasklist_lock across the setscheduler call, when
we pin the task structure with get_task_struct(). Interrupts are disabled
in setscheduler anyway and the permission checks do not need interrupts
disabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>