Commit Graph

60073 Commits

Author SHA1 Message Date
Alexey Dobriyan
786d7e1612 Fix rmmod/read/write races in /proc entries
Fix following races:
===========================================
1. Write via ->write_proc sleeps in copy_from_user(). Module disappears
   meanwhile. Or, more generically, system call done on /proc file, method
   supplied by module is called, module dissapeares meanwhile.

   pde = create_proc_entry()
   if (!pde)
	return -ENOMEM;
   pde->write_proc = ...
				open
				write
				copy_from_user
   pde = create_proc_entry();
   if (!pde) {
	remove_proc_entry();
	return -ENOMEM;
	/* module unloaded */
   }
				*boom*
==========================================
2. bogo-revoke aka proc_kill_inodes()

  remove_proc_entry		vfs_read
  proc_kill_inodes		[check ->f_op validness]
				[check ->f_op->read validness]
				[verify_area, security permissions checks]
	->f_op = NULL;
				if (file->f_op->read)
					/* ->f_op dereference, boom */

NOTE, NOTE, NOTE: file_operations are proxied for regular files only. Let's
see how this scheme behaves, then extend if needed for directories.
Directories creators in /proc only set ->owner for them, so proxying for
directories may be unneeded.

NOTE, NOTE, NOTE: methods being proxied are ->llseek, ->read, ->write,
->poll, ->unlocked_ioctl, ->ioctl, ->compat_ioctl, ->open, ->release.
If your in-tree module uses something else, yell on me. Full audit pending.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:39 -07:00
Alan Cox
5568b0e802 v850: enable arbitary speed tty ioctls
Adding the defines/macros activates the existing code in the tty layer
and allows this platform to use the arbitary speed ioctl setting layer

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:39 -07:00
Jeff Dike
23f465b728 uml: remove dead file
I forgot this file was here.  It hasn't been used since UML has been in
mainline.

Thanks to Jesper for finding something that needed doing to it, thus reminding
me of its existence.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:39 -07:00
Jeff Dike
8619b86f20 uml: limit request size on COWed devices
COWed devices can't handle more than 32 (64 on x86_64) sectors in one request
due to the size of the bitmap being carried around in the io_thread_req.

Enforce that by telling the block layer not to put too many sectors in
requests to COWed devices.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:39 -07:00
Jeff Dike
e8234483e7 uml: export hostfs symbols
Add some exports for hostfs that are required after Alberto Bertogli's fixes
for accessing unlinked host files.

Also did some style cleanups while I was here.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:39 -07:00
Jeff Dike
e4c4bf9968 uml: Eliminate kernel allocator wrappers
UML had two wrapper procedures for kmalloc, um_kmalloc and um_kmalloc_atomic
because the flag constants weren't available in userspace code.
kern_constants.h had made kernel constants available for a long time, so there
is no need for these wrappers any more.  Rather, userspace code calls kmalloc
directly with the userspace versions of the gfp flags.

kmalloc isn't a real procedure, so I had to essentially copy the inline
wrapper around __kmalloc.

vmalloc also had its own wrapper for no good reason.  This is now gone.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
c43990162f uml: simplify helper stack handling
run_helper and run_helper_thread had arguments which were the same in all
callers.  run_helper's stack_out was always NULL and run_helper_thread's
stack_order was always 0.  These are now gone, and the constants folded
into the code.

Also fixed leaks of the helper stack in the AIO and SIGIO code.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
42a359e31a uml: SIGIO support cleanup
Cleanup of the SIGWINCH support.

Some code and comment reformatting.

The stack used for SIGWINCH threads was leaked.  This is now fixed by storing
it with the pid and other information, and freeing it when the thread is
killed.

If something goes wrong with a WIGWINCH thread, and this is discovered in the
interrupt handler, the winch record would leak.  It is now freed, except that
the IRQ isn't freed.  This is hard to do from interrupt context.  This has the
side-effect that the IRQ system maintains a reference to the freed structure,
but that shouldn't cause a problem since the descriptor is disabled.

register_winch_irq is now much better about cleaning up after an
initialization failure.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
d14ad81f80 uml: handle errors on opening host side of consoles
If the host side of a console can't be opened, this will now produce visible
error messages.

enable_chan now returns a status and this is passed up to con_open and
ssl_open, which will complain if anything went wrong.

The default host device for the serial line driver is now a pts device rather
than a pty device since lots of hosts have LEGACY_PTYS disabled.  This had
always been failing on such hosts, but silently.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
75886f21e3 uml: pty channel tidying
Cleanup, mostly style violations.

Tidied the includes.

getmaster returns a real errno, which pty_open returns if there's a
problem.

The printks now have severity.

Changed os_* calls to call libc directly.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
63920f4717 uml: xterm driver tidying
Major tidying of the xterm console driver:
	got rid of the tt-mode gdb support
	tidied up the includes
	fixed lots of style violations
	replaced os_* calls with glibc calls in xterm.c
	all printk calls now have a severity indicator
	the error paths of xterm_open are closer to being right

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Eduard-Gabriel Munteanu
89df6bfc04 uml: DEBUG_SHIRQ fixes
DEBUG_SHIRQ generates spurious interrupts, triggering handlers such as
mconsole_interrupt() or line_interrupt().  They expect data to be available to
be read from their sockets/pipes, but in the case of spurious interrupts, the
host didn't actually send anything, so UML hangs in read() and friends.
Setting those fd's as O_NONBLOCK makes DEBUG_SHIRQ-enabled UML kernels boot
and run correctly.

Signed-off-by: Eduard-Gabriel Munteanu <maxdamage@aladin.ro>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
e18eecb8b3 Add generic exit-time stack-depth checking to CONFIG_DEBUG_STACK_USAGE
Add generic exit-time stack-depth checking to CONFIG_DEBUG_STACK_USAGE.

This also adds UML support.

Tested on UML and i386.

[akpm@linux-foundation.org: cleanups, speedups, tweaks]
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
84812217e3 uml: use get_free_pages to allocate kernel stacks
For some reason, I was using kmalloc instead of get_free_pages for kernel
stacks.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Jeff Dike
0a6d3a2a38 uml: fix request->sector update
It is theoretically possible for a request to finish and be freed between
writing it to the I/O thread and updating the sector count.  In this case, the
update will dereference a freed pointer.

To avoid this, I delay the update until processing the next sg segment, when
the request pointer is known to be good.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Robert P. J. Day
7ff9057db7 CRIS: replace old-style member inits with designated inits
Replace the old-style structure member initializers with designated
initializers.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Alan Cox
e30afd5119 etrax: enable arbitary speed setting on tty ports
Add the needed constants and bits. The actual code is already in the tty
layer and turned on by the definitions

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Hirokazu Takata
c12a54b3af m32r: A MAINTAINERS entry for the M32R architecture
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Alan Cox
86245b8507 m32r: enable arbitary speed tty rate setting
Add the defines and constants needed for the M32R platform to support the
arbitary speed tty ioctls.

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Mariusz Kozlowski
f47807f359 arm26: remove broken and unused macro
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Cc: Ian Molton <spyro@f2s.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Alan Cox
787ea0ef70 ARM26: enable arbitary speed tty ioctls and split input/output speed
Add the ioctls and values needed for this to the ARM26/ARM32 ports.  The
actual code has been in the base kernel for a while and automatically turns
on when a port sets the required defines.

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:38 -07:00
Ivan Kokshaysky
5c6af69abe fix alpha ISA support
isa_bus_to_virt() is still needed in a few places (lance.c, at least).  When
we switch the kernel to using -Werror-implicit-function-declaration, the lack
of isa_bus_to_virt() breaks alpha allmodconfig builds.

Add isa_bus_to_virt() and deprecate the ezisting ISA APIs, though it might be
better to define these functions as BUG(), since virt_to_bus/bus_to_virt just
do wrong things on a number of machines.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Sam Ravnborg
ebaf4fc13e alpha: fix trivial section mismatch warnings
Fix the following section mismatch warnings:
WARNING: arch/alpha/kernel/built-in.o(.text+0x7c78): Section mismatch: reference to .init.text:init_rtc_irq (between 'common_init_rtc' and 'timer_interrupt')
WARNING: arch/alpha/kernel/built-in.o(.text+0x7c7c): Section mismatch: reference to .init.text:init_rtc_irq (between 'common_init_rtc' and 'timer_interrupt')
WARNING: arch/alpha/kernel/built-in.o(.data+0x2c30): Section mismatch: reference to .init.text:srm_console_setup (between 'srmcons' and 'tsunami_pci_ops')

In all three cases functions marked __init was called outside __init context.
So the fix was to just drop the __init attribute.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Meelis Roos <mroos@linux.ee>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Yoshinori Sato
2fea299f74 h8300 entry.S update
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Yoshinori Sato
542f739d12 h8300: remove unused file
arch/h8300/kernel/ints.c is unused.

Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Yoshinori Sato
86277d5949 h8300 zImage support update
- Add missing files
- Add Makefile target
- Change image base
- Style fix

Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Alan Cox
f79224ca27 h8300: enable arbitary speed tty port setup
Add the needed constants and defines to activate the new tty code on this
platform

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Greg Ungerer
17d8725d36 m68knommu: remove old cache management cruft from mm code
Remove cache management cruft.  This code is dead, all the cache manangement
functions for the ColdFire exist in the header file
include/asm-m68knommu/cacheflush.h.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Greg Ungerer
2f75d1098c m68knommu: remove cruft from setup code
Clean out cruft.

. remove include files not needed
. remove not used CAT_ROMARRAY code
. remove generic machine pointers not used
. remove unused functions
. fix email address in copyrights

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Greg Ungerer
45a82f5198 m68knommu: use TRHEAD_SIZE instead of hard constant
Use THREAD_SIZE instead of a hard constant.

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Greg Ungerer
57c8f63e8e nommu: stub expand_stack() for nommu case
Be consistent with VM mmap, implement expand_stack().  We can't actually do
anything other than return an error in the no MMU case though.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
David Howells
54a3bdd76e FRV: Remove some dead code
Remove some dead chunks of code that are bounded by preprocessor conditionals
controlled by apparently no-longer available config options.

These are:

	CONFIG_BLK_DEV_BLKMEM
	CONFIG_CHR_DEV_FLASH
	CONFIG_BLK_DEV_FLASH
	CONFIG_CONSOLE

[Found by Robert P. J. Day <rpjday@mindspring.com>]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
David Howells
769259160a FRV: Be (self-)consistent and use CONFIG_GDB_CONSOLE everywhere
Be (self-)consistent and use CONFIG_GDB_CONSOLE everywhere rather than using
CONFIG_GDBSTUB_CONSOLE in some places and not others.  This is also then
consistent with other archs.

Also remove the gdbstub console device() op which doesn't seem to be necessary
now (especially as it doesn't compile).

[Found by Robert P. J. Day <rpjday@mindspring.com>]

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
David Howells
c1a39e0505 FRV: Connect up new syscalls
Connect up new system calls.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Miklos Szeredi
0165ab4435 split mmap
This is a straightforward split of do_mmap_pgoff() into two functions:

 - do_mmap_pgoff() checks the parameters, and calculates the vma
   flags.  Then it calls

 - mmap_region(), which does the actual mapping

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
akpm@linux-foundation.org
c44939ecb6 NeilBrown <neilb@suse.de>
The do_loop_readv_writev implementation of readv breaks out of the loop as
soon as a single read request didn't fill it's buffer:

		if (nr != len)
			break;

The generic_file_aio_read version doesn't.  So if it hits EOF before the end
of the list of buffers, it will try again on the next buffer.  If the file was
extended in the mean time, this will produce a bad result.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Herbert van den Bergh
5ed44a401d do not limit locked memory when RLIMIT_MEMLOCK is RLIM_INFINITY
Fix a bug in mm/mlock.c on 32-bit architectures that prevents a user from
locking more than 4GB of shared memory, or allocating more than 4GB of
shared memory in hugepages, when rlim[RLIMIT_MEMLOCK] is set to
RLIM_INFINITY.

Signed-off-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Acked-by: Chris Mason <chris.mason@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:37 -07:00
Paul Mundt
84a01c2f8e slob: sparsemem support
Currently slob is disabled if we're using sparsemem, due to an earlier
patch from Goto-san.  Slob and static sparsemem work without any trouble as
it is, and the only hiccup is a missing slab_is_available() in the case of
sparsemem extreme.  With this, we're rid of the last set of restrictions
for slob usage.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Hugh Dickins
5dc4ac6324 mspec_mmap: don't set VM_LOCKED
mspec_mmap was setting VM_LOCKED (without adjusting locked_vm): don't do
that, it serves no purpose in 2.6, other than to mess up the locked_vm
accounting - mspec's pages won't get reclaimed anyway.  Thanks to Dmitry
Monakhov for raising the issue.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Jes Sorensen <jes@sgi.com>
Cc: Dmitry Monakhov <dmonakhov@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Dan Aloni
b49ad484c5 mm/page_alloc.c: lower printk severity
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Paul Mundt
6193a2ff18 slob: initial NUMA support
This adds preliminary NUMA support to SLOB, primarily aimed at systems with
small nodes (tested all the way down to a 128kB SRAM block), whether
asymmetric or otherwise.

We follow the same conventions as SLAB/SLUB, preferring current node
placement for new pages, or with explicit placement, if a node has been
specified.  Presently on UP NUMA this has the side-effect of preferring
node#0 allocations (since numa_node_id() == 0, though this could be
reworked if we could hand off a pfn to determine node placement), so
single-CPU NUMA systems will want to place smaller nodes further out in
terms of node id.  Once a page has been bound to a node (via explicit node
id typing), we only do block allocations from partial free pages that have
a matching node id in the page flags.

The current implementation does have some scalability problems, in that all
partial free pages are tracked in the global freelist (with contention due
to the single spinlock).  However, these are things that are being reworked
for SMP scalability first, while things like per-node freelists can easily
be built on top of this sort of functionality once it's been added.

More background can be found in:

	http://marc.info/?l=linux-mm&m=118117916022379&w=2
	http://marc.info/?l=linux-mm&m=118170446306199&w=2
	http://marc.info/?l=linux-mm&m=118187859420048&w=2

and subsequent threads.

Acked-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Jason Baron
f797779324 speed up madvise_need_mmap_write() usage
In the new madvise_need_mmap_write() call we can avoid an extra case
statement and function call as follows.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Adrian Bunk
897e679b17 mm/slab.c: start_cpu_timer() should be __cpuinit
start_cpu_timer() should be __cpuinit (which also matches what it's
callers are).

__devinit didn't cause problems, it simply wasted a few bytes of memory
for the common CONFIG_HOTPLUG_CPU=n case.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Paul Mundt
6ea6e6887d mm: more __meminit annotations
Currently zone_spanned_pages_in_node() and zone_absent_pages_in_node() are
non-static for ARCH_POPULATES_NODE_MAP and static otherwise.  However, only
the non-static versions are __meminit annotated, despite only being called
from __meminit functions in either case.

zone_init_free_lists() is currently non-static and not __meminit annotated
either, despite only being called once in the entire tree by
init_currently_empty_zone(), which too is __meminit.  So make it static and
properly annotated.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Jan Beulich
8f0accc862 kill vmalloc_earlyreserve
This symbol got orphaned quite a while ago.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Jan Beulich
45e98cdb6d page table handling cleanup
Kill pte_rdprotect(), pte_exprotect(), pte_mkread(), pte_mkexec(), pte_read(),
pte_exec(), and pte_user() except where arch-specific code is making use of
them.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Jan Beulich
98011f569e mm: fix improper .init-type section references
.. which modpost started warning about.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Paul Mundt
140d5a4904 numa: mempolicy: trivial debug fixes.
Enabling debugging fails to build due to the nodemask variable in
do_mbind() having changed names, and then oopses on boot due to the
assumption that the nodemask can be dereferenced -- which doesn't work out
so well when the policy is changed to MPOL_DEFAULT with a NULL nodemask by
numa_default_policy().

This fixes it up, and switches from PDprintk() to pr_debug() while
we're at it.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Ethan Solomita
462e00cc71 oom: stop allocating user memory if TIF_MEMDIE is set
get_user_pages() can try to allocate a nearly unlimited amount of memory on
behalf of a user process, even if that process has been OOM killed.  The
OOM kill occurs upon return to user space via a SIGKILL, but
get_user_pages() will try allocate all its memory before returning.  Change
get_user_pages() to check for TIF_MEMDIE, and if set then return
immediately.

Signed-off-by: Ethan Solomita <solo@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Paul Mundt
b71636e298 numa: mempolicy: dynamic interleave map for system init
This converts the default system init memory policy to use a dynamically
created node map instead of defaulting to all online nodes.  Nodes of a
certain size (>= 16MB) are judged to be suitable for interleave, and are added
to the map.  If all nodes are smaller in size, the largest one is
automatically selected.

Without this, tiny nodes find themselves out of memory before we even make it
to userspace.  Systems with large nodes will notice no change.

Only the system init policy is effected by this change, the regular
MPOL_DEFAULT policy is still switched to later on in the boot process as
normal.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00