signal_struct is (mostly) protected by ->sighand->siglock, I think we don't
need ->taskstats_lock to protect ->stats. This also allows us to simplify the
locking in fill_tgid().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
taskstats_tgid_free() is called on copy_process's error path. This is wrong.
IF (clone_flags & CLONE_THREAD)
We should not clear ->signal->taskstats, current uses it,
it probably has a valid accumulated info.
ELSE
taskstats_tgid_init() set ->signal->taskstats = NULL,
there is nothing to free.
Move the callsite to __exit_signal(). We don't need any locking, entire
thread group is exiting, nobody should have a reference to soon to be
released ->signal.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This means we can call it when the bitmap we want to fetch is declared
const.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If __vmalloc is called to allocate memory with GFP_ATOMIC in atomic
context, the chain of calls results in __get_vm_area_node allocating memory
for vm_struct with GFP_KERNEL, causing the 'sleeping from invalid context'
warning. This patch fixes it by passing the gfp flags along so
__get_vm_area_node allocates memory for vm_struct with the same flags.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The temp_priority field in zone is racy, as we can walk through a reclaim
path, and just before we copy it into prev_priority, it can be overwritten
(say with DEF_PRIORITY) by another reclaimer.
The same bug is contained in both try_to_free_pages and balance_pgdat, but
it is fixed slightly differently. In balance_pgdat, we keep a separate
priority record per zone in a local array. In try_to_free_pages there is
no need to do this, as the priority level is the same for all zones that we
reclaim from.
Impact of this bug is that temp_priority is copied into prev_priority, and
setting this artificially high causes reclaimers to set distress
artificially low. They then fail to reclaim mapped pages, when they are,
in fact, under severe memory pressure (their priority may be as low as 0).
This causes the OOM killer to fire incorrectly.
From: Andrew Morton <akpm@osdl.org>
__zone_reclaim() isn't modifying zone->prev_priority. But zone->prev_priority
is used in the decision whether or not to bring mapped pages onto the inactive
list. Hence there's a risk here that __zone_reclaim() will fail because
zone->prev_priority ir large (ie: low urgency) and lots of mapped pages end up
stuck on the active list.
Fix that up by decreasing (ie making more urgent) zone->prev_priority as
__zone_reclaim() scans the zone's pages.
This bug perhaps explains why ZONE_RECLAIM_PRIORITY was created. It should be
possible to remove that now, and to just start out at DEF_PRIORITY?
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- Consolidate page_cache_alloc
- Fix splice: only the pagecache pages and filesystem data need to use
mapping_gfp_mask.
- Fix grab_cache_page_nowait: same as splice, also honour NUMA placement.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The multithreaded-probing code has a problem: after one initcall level (eg,
core_initcall) has been processed, we will then start processing the next
level (postcore_initcall) while the kernel threads which are handling
core_initcall are still executing. This breaks the guarantees which the
layered initcalls previously gave us.
IOW, we want to be multithreaded _within_ an initcall level, but not between
different levels.
Fix that up by causing the probing code to wait for all outstanding probes at
one level to complete before we start processing the next level.
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a vmlinux.lds.h helper macro for defining the eight-level initcall table,
teach all the architectures to use it.
This is a prerequisite for a patch which performs initcall synchronisation for
multithreaded-probing.
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
[ Added AVR32 as well ]
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A number of new drivers require io{read,write}{8,16,32}{be,} family of io
operations. These are provided for the AVR32 by this patch in the form of
a series of macros.
Access to the (memory mapped) io space through these macros is defined to
be little endian only as little endian devices (such as PCI) are the main
consumer of IO access. If high speed access is required,
io{read,write}{16,32}be macros are supplied to perform native big endian
access to this io space.
Signed-off-by: Ben Nizette <ben@mallochdigital.com>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When calling e.g. atomic_sub_return with a large constant, the
compiler may output an immediate that is too large for the sub
instruction in the middle of the loop.
Fix this by explicitly specifying the number of bits allowed in the
constraint. Also stop atomic_add_return() and friends from falling
back to their respective "sub" variants if the constant is too large
to fit in an immediate.
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
[PATCH] x86-64: Only look at per_cpu data for online cpus.
[PATCH] x86-64: Simplify the vector allocator.
* 'merge' of master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Make sure __cpu_preinit_ppc970 gets called on 970GX processors
[POWERPC] Fix CHRP platforms with only 8259
[POWERPC] IPIC: Fix spinlock recursion in set_irq_handler
[POWERPC] Fix the UCC rx/tx clock of QE
[POWERPC] cell: update defconfig
[POWERPC] spufs: fix another off-by-one bug in spufs_mbox_read
[POWERPC] spufs: fix signal2 file to report signal2
[POWERPC] Fix device_is_compatible() const warning
[POWERPC] Cell timebase bug workaround
[POWERPC] Support feature fixups in modules
[POWERPC] Support feature fixups in vdso's
[POWERPC] Support nested cpu feature sections
[POWERPC] Consolidate feature fixup code
[POWERPC] Fix hang in start_ldr if _end or _edata is unaligned
[POWERPC] Fix spelling errors in ucc_fast.c and ucc_slow.c
[POWERPC] Don't require execute perms on wrapper when building zImage.initrd
[POWERPC] Add 970GX cputable entry
[POWERPC] Fix build breakage with CONFIG_PPC32
[POWERPC] Fix compiler warning message on get_property call
[POWERPC] Simplify stolen time calculation
This patch adds support for REPORT TARGET PORT GROUPS. This is used
eg for the multipathing priority callout to determine the path
priority.
With this patch multipath-tools can use the existing mpath_prio_alua
callout to exercise the path priority grouping.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If connection creation fails we end up calling list_del
on a invalid struct. This then causes an oops. We are not
acutally using the lists (old MCS code we thought might
be useful elsewhere) so this patch just removes that
code.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The transport class recv mempools are causing slab corruption.
We could hack around netlink's lack of mempool support like dm,
but it is just too ulgy (dm's hack is ugly enough :) when you need
to support broadcast.
This patch removes the recv pools. We have not used them even when
we were allocting 20 MB per session and the system only had 64 MBs.
And we have no pools on the send side and have been ok there. When
Peter's work gets merged we can use that since the network guys
are in favor of that approach and are not going to add mempools
everywhere.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
On CHRP platforms with only a 8259 controller, we should set the
default IRQ host to the 8259 driver's one for the IRQ probing
fallbacks to work in case the IRQ tree is incorrect (like on
Pegasos for example). Without this fix, we get a bunch of WARN_ON's
during boot.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix a const'ification related warning with device_is_compatible()
and friends related to get_property() not properly having const
on it's input device node argument.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The Cell CPU timebase has an erratum. When reading the entire 64 bits
of the timebase with one mftb instruction, there is a handful of cycles
window during which one might read a value with the low order 32 bits
already reset to 0x00000000 but the high order bits not yet incremeted
by one. This fixes it by reading the timebase again until the low order
32 bits is no longer 0. That might introduce occasional latencies if
hitting mftb just at the wrong time, but no more than 70ns on a cell
blade, and that was considered acceptable.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch reworks the feature fixup mecanism so vdso's can be fixed up.
The main issue was that the construct:
.long label (or .llong on 64 bits)
will not work in the case of a shared library like the vdso. It will
generate an empty placeholder in the fixup table along with a reloc,
which is not something we can deal with in the vdso.
The idea here (thanks Alan Modra !) is to instead use something like:
1:
.long label - 1b
That is, the feature fixup tables no longer contain addresses of bits of
code to patch, but offsets of such code from the fixup table entry
itself. That is properly resolved by ld when building the .so's. I've
modified the fixup mecanism generically to use that method for the rest
of the kernel as well.
Another trick is that the 32 bits vDSO included in the 64 bits kernel
need to have a table in the 64 bits format. However, gas does not
support 32 bits code with a statement of the form:
.llong label - 1b (Or even just .llong label)
That is, it cannot emit the right fixup/relocation for the linker to use
to assign a 32 bits address to an .llong field. Thus, in the specific
case of the 32 bits vdso built as part of the 64 bits kernel, we are
using a modified macro that generates:
.long 0xffffffff
.llong label - 1b
Note that is assumes that the value is negative which is enforced by
the .lds (those offsets are always negative as the .text is always
before the fixup table and gas doesn't support emiting the reloc the
other way around).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds some macros that can be used with an explicit label in
order to nest cpu features. This should be used very careful but is
necessary for the upcoming cell TB fixup.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There are currently two versions of the functions for applying the
feature fixups, one for CPU features and one for firmware features. In
addition, they are both in assembly and with separate implementations
for 32 and 64 bits. identify_cpu() is also implemented in assembly and
separately for 32 and 64 bits.
This patch replaces them with a pair of C functions. The call sites are
slightly moved on ppc64 as well to be called from C instead of from
assembly, though it's a very small change, and thus shouldn't cause any
problem.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When I generalized __assign_irq_vector I failed to pay attention
to what happens when you access a per cpu data structure for
a cpu that is not online. It is an undefined case making any
code that does it have undefined behavior as well.
The code still needs to be able to allocate a vector across cpus
that are not online to properly handle combinations like lowest
priority interrupt delivery and cpu_hotplug. Not that we can do
that today but the infrastructure shouldn't prevent it.
So this patch updates the places where we touch per cpu data
to only touch online cpus, it makes cpu vector allocation
an atomic operation with respect to cpu hotplug, and it updates
the cpu start code to properly initialize vector_irq so we
don't have inconsistencies.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
The PXA255 has 84 GPIO lines available. This patch allows access to 81-84
Signed-off-by: Craig Hughes <craig@gumstix.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Eliminate more __must_check madness.
The return code from device_for_each_child() depends on the values
which the helper function returns. If the helper function always
returns zero, it's utterly pointless to check the return code from
device_for_each_child().
The only code which knows if the return value should be checked is
the caller itself, so forcing the return code to always be checked
is silly. Hence, remove the __must_check annotation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/perex/alsa:
[ALSA] hda-intel - Add check of MSI availabity
[ALSA] version 1.0.13
[ALSA] Fix addition of user-defined boolean controls
[ALSA] Fix AC97 power-saving mode
[ALSA] Fix re-use of va_list
[ALSA] hda_intel: add ATI RS690 HDMI audio support
[ALSA] hda-codec - Add model entry for ASUS U5F laptop
[ALSA] Fix dependency of snd-adlib driver in Kconfig
[ALSA] Various fixes for suspend/resume of ALSA PCI drivers
[ALSA] hda-codec - Fix assignment of PCM devices for Realtek codecs
[ALSA] sound/isa/opti9xx/opti92x-ad1848.c: check kmalloc() return value
[ALSA] sound/isa/ad1816a/ad1816a.c: check kmalloc() return value
[ALSA] sound/isa/cmi8330.c: check kmalloc() return value
[ALSA] sound/isa/gus/interwave.c: check kmalloc() return value
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[PKT_SCHED] netem: Orphan SKB when adding to queue.
[NET]: kernel-doc fix for sock.h
[NET]: Reduce sizeof(struct flowi) by 20 bytes.
[IPv6] fib: initialize tb6_lock in common place to give lockdep a key
[ATM] nicstar: Fix a bogus casting warning
[ATM] firestream: handle thrown error
[ATM]: No need to return void
[ATM]: handle sysfs errors
[DCCP] ipv6: Fix opt_skb leak.
[DCCP]: Fix Oops in DCCPv6
Otherwise we get a ton of unaligned exceptions, for cases such
as compat_sys_msgrcv() which go:
p = compat_alloc_user_space(second + sizeof(struct msgbuf));
and here 'second' can for example be an arbitrary odd value.
Based upon a bug report from Jurij Smakov.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warning in include/net/sock.h:
Warning(/var/linsrc/linux-2619-rc1-pv//include/net/sock.h:894): No description found for parameter 'rcu'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
As suggested by David, just kill off some unused fields in dnports to
reduce sizef(struct flowi). If they come back, they should be moved to
nl_u.dn_u in order not to enlarge again struct flowi
[ Modified to really delete this stuff instead of using #if 0. -DaveM ]
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current implementation uses a sequence of a cacheflush and a copy.
This is racy in case of a multithreaded debuggee and renders GDB
virtually unusable.
Aside this fixes a performance hog rendering access to /proc/cmdline very
slow and resulting in a enough cache stalls for the 34K AP/SP programming
model to make the bare metal code on the non-Linux VPE miss RT deadlines.
The main part of this patch was originally written by Ralf Baechle;
Atushi Nemoto did the the debugging.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[PATCH] libata-sff: Allow for wacky systems
[PATCH] ahci: readability tweak
[PATCH] libata: typo fix
[PATCH] ATA must depend on BLOCK
[PATCH] libata: use correct map_db values for ICH8
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (22 commits)
[PATCH] ibmveth: Fix index increment calculation
[PATCH] Fix timer race
[PATCH] Remove useless comment from sb1250
[PATCH] ucc_geth: changes to ucc_geth driver as a result of qe_lib changes and bugfixes
[PATCH] sky2: 88E803X transmit lockup
[PATCH] e1000: Reset all functions after a PCI error
[PATCH] WAN/pc300: handle, propagate minor errors
[PATCH] Update smc91x driver with ARM Versatile board info
[PATCH] wireless: WE-20 compatibility for ESSID and NICKN ioctls
[PATCH] zd1211rw: fix build-break caused by association race fix
[PATCH] sotftmac: fix a slab corruption in WEP restricted key association
[PATCH] airo: check if need to freeze
[PATCH] wireless: More WE-21 potential overflows...
[PATCH] zd1201: Possible NULL dereference
[PATCH] orinoco: fix WE-21 buffer overflow
[PATCH] airo.c: check returned values
[PATCH] bcm43xx-softmac: Fix system hang for x86-64 with >1GB RAM
[PATCH] bcm43xx-softmac: check returned value from pci_enable_device
[PATCH] softmac: Fix WX and association related races
[PATCH] bcm43xx: fix race condition in periodic work handler
...
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
[PATCH] x86-64: Revert timer routing behaviour back to 2.6.16 state
[PATCH] x86-64: Overlapping program headers in physical addr space fix
[PATCH] x86-64: Put more than one cpu in TARGET_CPUS
[PATCH] x86: Revert new unwind kernel stack termination
[PATCH] x86-64: Use irq_domain in ioapic_retrigger_irq
[PATCH] i386: Disable nmi watchdog on all ThinkPads
[PATCH] x86-64: Revert interrupt backlink changes
[PATCH] x86-64: Fix ENOSYS in system call tracing
[PATCH] i386: Fix fake return address
[PATCH] x86-64: x86_64 add NX mask for PTE entry
[PATCH] x86-64: Speed up dwarf2 unwinder
[PATCH] x86: Use -maccumulate-outgoing-args
[PATCH] x86-64: fix page align in e820 allocator
[PATCH] x86-64: Fix for arch/x86_64/pci/Makefile CFLAGS
[PATCH] i386: fix .cfi_signal_frame copy-n-paste error
[PATCH] x86-64: typo in __assign_irq_vector when updating pos for vector and offset
[PATCH] x86-64: x86_64 hot-add memory srat.c fix
[PATCH] i386: Update defconfig
[PATCH] x86-64: Update defconfig
Mistyped an ifdef CONFIG_CPUSETS - fixed.
I doubt that anyone ever noticed. The impact of this typo was
that if someone:
1) was using MPOL_BIND to force off node allocations
2) while using cpusets to constrain memory placement
3) when that cpuset was migrating that jobs memory
4) while the tasks in that job were actively forking
then there was a rare chance that future allocations using
that MPOL_BIND policy would be node local, not off node.
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Reintroduce NODES_SPAN_OTHER_NODES for powerpc
Revert "[PATCH] Remove SPAN_OTHER_NODES config definition"
This reverts commit f62859bb68.
Revert "[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES"
This reverts commit a94b3ab7ea.
Also update the comments to indicate that this is still required
and where its used.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Kravetz <kravetz@us.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We seem to have lost the declaration of pci_get_device_reverse(), if we ever
had one.
Add a CONFIG_PCI=0 stub too.
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
And a couple of bug fixes found by sparse.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Includes a couple of bugfixes found by sparse.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
.. so that you can use bitmaps with 32bit userspace on a 64 bit kernel.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] Remove SUID when splicing into an inode
[PATCH] Add lockless helpers for remove_suid()
[PATCH] Introduce generic_file_splice_write_nolock()
[PATCH] Take i_mutex in splice_from_pipe()