Commit Graph

730 Commits

Author SHA1 Message Date
David Chinner
a7430847fc [XFS] Fix broken inode cluster setup.
The radix tree based inode caches did away with the inode cluster hashes,
replacing them with a bunch of masking and gang lookups on the radix tree.

This masking got broken when moving the code to per-ag radix trees and
indexing by agino # rather than straight inode number. The result is
clustered inode writeback does not cluster and things can go extremely
slowly when there are lots of inodes to write.

Fix it up by comparing the agino # of the inode we just looked up to the
index of the cluster we are looking for.

Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com>

SGI-PV: 972915
SGI-Modid: xfs-linux-melb:xfs-kern:30033a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2007-12-10 13:46:59 +11:00
Lachlan McIlroy
77be55a5a1 [XFS] Clear XBF_READ_AHEAD flag on I/O completion.
SGI-PV: 972554
SGI-Modid: xfs-linux-melb:xfs-kern:30128a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2007-12-10 13:46:45 +11:00
Lachlan McIlroy
d1afb678ce [XFS] Fixed a few bugs in xfs_buf_associate_memory()
- calculation of 'page_count' was incorrect as it did not
  consider the offset of 'mem' into the first page. The
  logic to bump 'page_count' didn't work if 'len' was <=
  PAGE_CACHE_SIZE (ie offset = 3k, len = 2k).
- setting b_buffer_length to 'len' is incorrect if 'offset'
  is > 0. Set it to the total length of the buffer.
- I suspect that passing a non-aligned address into
  mem_to_page() for the first page may have been causing
  issues - don't know but just tidy up that code anyway.

SGI-PV: 971596
SGI-Modid: xfs-linux-melb:xfs-kern:30143a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2007-12-10 13:46:20 +11:00
Lachlan McIlroy
cd57e594ad [XFS] 971064 Various fixups for xfs_bulkstat().
- sanity check for NULL user buffer in xfs_ioc_bulkstat[_compat]()
- remove the special case for XFS_IOC_FSBULKSTAT with count == 1. This
  special case causes bulkstat to fail because the special case uses
  xfs_bulkstat_single() instead of xfs_bulkstat() and the two functions
  have different semantics.  xfs_bulkstat() will return the next inode
  after the one supplied while skipping internal inodes (ie quota inodes).
  xfs_bulkstate_single() will only lookup the inode supplied and return
  an error if it is an internal inode.
- in xfs_bulkstat(), need to initialise 'lastino' to the inode supplied
  so in cases were we return without examining any inodes the scan wont
  restart back at zero.
- sanity check for valid *ubcountp values. Cannot sanity check for valid
  ubuffer here because some users of xfs_bulkstat() don't supply a buffer.
- checks against 'ubleft' (the space left in the user's buffer) should be
  against 'statstruct_size' which is the supplied minimum object size.
  The mixture of checks against statstruct_size and 0 was one of the
  reasons we were skipping inodes.
- if the formatter function returns BULKSTAT_RV_NOTHING and an error and
  the error is not ENOENT or EINVAL then we need to abort the scan. ENOENT
  is for inodes that are no longer valid and we just skip them. EINVAL is
  returned if we try to lookup an internal inode so we skip them too. For
  a DMF scan if the inode and DMF attribute cannot fit into the space left
  in the user's buffer it would return ERANGE. We didn't handle this error
  and skipped the inode. We would continue to skip inodes until one fitted
  into the user's buffer or we completed the scan.
- put back the recalculation of agino (that got removed with the last fix)
  at the end of the while loop. This is because the code at the start of
  the loop expects agino to be the last inode examined if it is non-zero.
- if we found some inodes but then encountered an error, return success
  this time and the error next time. If the formatter aborted with ENOMEM
  we will now return this error but only if we couldn't read any inodes.
  Previously if we encountered ENOMEM without reading any inodes we
  returned a zero count and no error which falsely indicated the scan was
  complete.

SGI-PV: 973431
SGI-Modid: xfs-linux-melb:xfs-kern:30089a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2007-12-10 13:44:11 +11:00
Donald Douwsma
d757762bf2 [XFS] Fix dbflush panic in xfs_qm_sync.
The recent behaviour layer removal dropped the check for quotas that have
been requested at mount time but have subsequently been turned off. This
results in a panic when accessing m_quotainfo which has been freed.

This patch adds the check originally made by xfs_qm_syncall() to
xfs_qm_sync().

SGI-PV: 969769
SGI-Modid: xfs-linux-melb:xfs-kern:29908a

Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2007-12-10 13:40:10 +11:00
Christoph Hellwig
3965516440 exportfs: make struct export_operations const
Now that nfsd has stopped writing to the find_exported_dentry member we an
mark the export_operations const

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <linux-ext4@vger.kernel.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:21 -07:00
Christoph Hellwig
c38344fe9e xfs: new export ops
This one is a lot more complicated than the previous ones.  XFS already had a
very clever scheme for supporting 64bit inode numbers in filehandles, and I've
reworked this to be some kind of a prototype for the generic 64bit inode
filehandle support.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Neil Brown <neilb@suse.de>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:20 -07:00
Christoph Hellwig
c6143911a7 [XFS] cleanup fid types mess
Currently XFs has three different fid types: struct fid, struct xfs_fid
and struct xfs_fid2 with hte latter two beeing identicaly and the first
one beeing the same size but an unstructured array with the same size.

This patch consolidates all this to alway uuse struct xfs_fid.

This patch is required for an upcoming patch series from me that revamps
the nfs exporting code and introduces a Linux-wide struct fid.

SGI-PV: 970336
SGI-Modid: xfs-linux-melb:xfs-kern:29651a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-19 18:02:55 +10:00
Christoph Hellwig
c8fcfac5a2 [XFS] fixups after behavior removal merge into mainline git
Fixup for lack of dmapi support and no quota module support.

SGI-PV: 969985

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-19 17:14:45 +10:00
Linus Torvalds
347c53dca7 Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: (59 commits)
  [XFS] eagerly remove vmap mappings to avoid upsetting Xen
  [XFS] simplify validata_fields
  [XFS] no longer using io_vnode, as was remaining from 23 cherrypick
  [XFS] Remove STATIC which was missing from prior manual merge
  [XFS] Put back the QUEUE_ORDERED_NONE test in the barrier check.
  [XFS] Turn off XBF_ASYNC flag before re-reading superblock.
  [XFS] avoid race in sync_inodes() that can fail to write out all dirty data
  [XFS] This fix prevents bulkstat from spinning in an infinite loop.
  [XFS] simplify xfs_create/mknod/symlink prototype
  [XFS] avoid xfs_getattr in XFS_IOC_FSGETXATTR ioctl
  [XFS] get_bulkall() could return incorrect inode state
  [XFS] Kill unused IOMAP_EOF flag
  [XFS] fix when DMAPI mount option processing happens
  [XFS] ensure file size is logged on synchronous writes
  [XFS] growlock should be a mutex
  [XFS] replace some large xfs_log_priv.h macros by proper functions
  [XFS] kill struct bhv_vfs
  [XFS] move syncing related members from struct bhv_vfs to struct xfs_mount
  [XFS] kill the vfs_flags member in struct bhv_vfs
  [XFS] kill the vfs_fsid and vfs_altfsid members in struct bhv_vfs
  ...
2007-10-17 09:04:11 -07:00
Joern Engel
1c0eeaf569 introduce I_SYNC
I_LOCK was used for several unrelated purposes, which caused deadlock
situations in certain filesystems as a side effect.  One of the purposes
now uses the new I_SYNC bit.

Also document the various bits and change their order from historical to
logical.

[bunk@stusta.de: make fs/inode.c:wake_up_inode() static]
Signed-off-by: Joern Engel <joern@wohnheim.fh-wedel.de>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: David Chinner <dgc@sgi.com>
Cc: Anton Altaparmakov <aia21@cam.ac.uk>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:02 -07:00
Fengguang Wu
1f7decf6d9 writeback: remove pages_skipped accounting in __block_write_full_page()
Miklos Szeredi <miklos@szeredi.hu> and me identified a writeback bug:

> The following strange behavior can be observed:
>
> 1. large file is written
> 2. after 30 seconds, nr_dirty goes down by 1024
> 3. then for some time (< 30 sec) nothing happens (disk idle)
> 4. then nr_dirty again goes down by 1024
> 5. repeat from 3. until whole file is written
>
> So basically a 4Mbyte chunk of the file is written every 30 seconds.
> I'm quite sure this is not the intended behavior.

It can be produced by the following test scheme:

# cat bin/test-writeback.sh
grep nr_dirty /proc/vmstat
echo 1 > /proc/sys/fs/inode_debug
dd if=/dev/zero of=/var/x bs=1K count=204800&
while true; do grep nr_dirty /proc/vmstat; sleep 1; done

# bin/test-writeback.sh
nr_dirty 19207
nr_dirty 19207
nr_dirty 30924
204800+0 records in
204800+0 records out
209715200 bytes (210 MB) copied, 1.58363 seconds, 132 MB/s
nr_dirty 47150
nr_dirty 47141
nr_dirty 47142
nr_dirty 47142
nr_dirty 47142
nr_dirty 47142
nr_dirty 47205
nr_dirty 47214
nr_dirty 47214
nr_dirty 47214
nr_dirty 47214
nr_dirty 47214
nr_dirty 47215
nr_dirty 47216
nr_dirty 47216
nr_dirty 47216
nr_dirty 47154
nr_dirty 47143
nr_dirty 47143
nr_dirty 47143
nr_dirty 47143
nr_dirty 47143
nr_dirty 47142
nr_dirty 47142
nr_dirty 47142
nr_dirty 47142
nr_dirty 47134
nr_dirty 47134
nr_dirty 47135
nr_dirty 47135
nr_dirty 47135
nr_dirty 46097 <== -1038
nr_dirty 46098
nr_dirty 46098
nr_dirty 46098
[...]
nr_dirty 46091
nr_dirty 46092
nr_dirty 46092
nr_dirty 45069 <== -1023
nr_dirty 45056
nr_dirty 45056
nr_dirty 45056
[...]
nr_dirty 37822
nr_dirty 36799 <== -1023
[...]
nr_dirty 36781
nr_dirty 35758 <== -1023
[...]
nr_dirty 34708
nr_dirty 33672 <== -1024
[...]
nr_dirty 33692
nr_dirty 32669 <== -1023

% ls -li /var/x
847824 -rw-r--r-- 1 root root 200M 2007-08-12 04:12 /var/x

% dmesg|grep 847824  # generated by a debug printk
[  529.263184] redirtied inode 847824 line 548
[  564.250872] redirtied inode 847824 line 548
[  594.272797] redirtied inode 847824 line 548
[  629.231330] redirtied inode 847824 line 548
[  659.224674] redirtied inode 847824 line 548
[  689.219890] redirtied inode 847824 line 548
[  724.226655] redirtied inode 847824 line 548
[  759.198568] redirtied inode 847824 line 548

# line 548 in fs/fs-writeback.c:
543                 if (wbc->pages_skipped != pages_skipped) {
544                         /*
545                          * writeback is not making progress due to locked
546                          * buffers.  Skip this inode for now.
547                          */
548                         redirty_tail(inode);
549                 }

More debug efforts show that __block_write_full_page()
never has the chance to call submit_bh() for that big dirty file:
the buffer head is *clean*. So basicly no page io is issued by
__block_write_full_page(), hence pages_skipped goes up.

Also the comment in generic_sync_sb_inodes():

544                         /*
545                          * writeback is not making progress due to locked
546                          * buffers.  Skip this inode for now.
547                          */

and the comment in __block_write_full_page():

1713                 /*
1714                  * The page was marked dirty, but the buffers were
1715                  * clean.  Someone wrote them back by hand with
1716                  * ll_rw_block/submit_bh.  A rare case.
1717                  */

do not quite agree with each other. The page writeback should be skipped for
'locked buffer', but here it is 'clean buffer'!

This patch fixes this bug. Though I'm not sure why __block_write_full_page()
is called only to do nothing and who actually issued the writeback for us.

This is the two possible new behaviors after the patch:

1) pretty nice: wait 30s and write ALL:)
2) not so good:
	- during the dd: ~16M
	- after 30s:      ~4M
	- after 5s:       ~4M
	- after 5s:     ~176M

The next patch will fix case (2).

Cc: David Chinner <dgc@sgi.com>
Cc: Ken Chen <kenchen@google.com>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:43:02 -07:00
Christoph Lameter
4ba9b9d0ba Slab API: remove useless ctor parameter and reorder parameters
Slab constructors currently have a flags parameter that is never used.  And
the order of the arguments is opposite to other slab functions.  The object
pointer is placed before the kmem_cache pointer.

Convert

        ctor(void *object, struct kmem_cache *s, unsigned long flags)

to

        ctor(struct kmem_cache *s, void *object)

throughout the kernel

[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:45 -07:00
Jeremy Fitzhardinge
7f01507234 [XFS] eagerly remove vmap mappings to avoid upsetting Xen
XFS leaves stray mappings around when it vmaps memory to make it virtually
contigious. This upsets Xen if one of those pages is being recycled into a
pagetable, since it finds an extra writable mapping of the page.

This patch solves the problem in a brute force way, by making XFS always
eagerly unmap its mappings.

SGI-PV: 971902
SGI-Modid: xfs-linux-melb:xfs-kern:29886a

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-17 14:14:35 +10:00
Christoph Hellwig
6572bc28de [XFS] simplify validata_fields
Stop using xfs_getattr and a onstack bhv_vattr_t just to get three fields
from the underlying inode and opencode copying from the inode fields
instead.

SGI-PV: 970662
SGI-Modid: xfs-linux-melb:xfs-kern:29711a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-17 11:10:14 +10:00
Nick Piggin
d79689c703 xfs: convert to new aops
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: David Chinner <dgc@sgi.com>
Cc: Timothy Shimmin <tes@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:55 -07:00
Tim Shimmin
150f29ef2e [XFS] no longer using io_vnode, as was remaining from 23 cherrypick
Because we cherrypicked SGI-Modid xfs-linux-melb:xfs-kern:29675a
and it depended on the sgi mod which removed io_vnode (which was
not cherrypicked in 23) it was hand modified.
This fixes things back up (to the originial mod) now we have moved
on again.

Reviewed-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 16:20:12 +10:00
Tim Shimmin
479ba36bbb [XFS] Remove STATIC which was missing from prior manual merge
Removes STATIC on xfs_freeze function which was not manually
applied for SGI-Modid: xfs-linux-melb:xfs-kern:29504a.

Reviewed-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 15:32:57 +10:00
Tim Shimmin
cd514bdaa8 [XFS] Put back the QUEUE_ORDERED_NONE test in the barrier check.
Put back the QUEUE_ORDERED_NONE test which caused us grief in sles when it
was taken out as, IIRC, it allowed md/lvm to be thought of as supporting
barriers when they weren't in some configurations. This patch will be
reverting what went in as part of a change for the SGI-pv 964544
(SGI-Modid: xfs-linux-melb:xfs-kern:28568a).

SGI-PV: 971783
SGI-Modid: xfs-linux-melb:xfs-kern:29882a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
2007-10-16 14:23:21 +10:00
Lachlan McIlroy
bebf963fec [XFS] Turn off XBF_ASYNC flag before re-reading superblock.
SGI-PV: 971603
SGI-Modid: xfs-linux-melb:xfs-kern:29871a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:22:39 +10:00
Lachlan McIlroy
e893bffd4c [XFS] avoid race in sync_inodes() that can fail to write out all dirty data
In xfs_fs_sync_super() treat a sync the same as a filesystem freeze. This
is needed to force the log to disk for inodes which are not marked dirty
in the Linux inode (the inodes are marked dirty on completion of the log
I/O) and so sync_inodes() will not flush them.

In xfs_fs_write_inode() a synchronous flush will not get an EAGAIN from
xfs_inode_flush() and if an asynchronous flush returns EAGAIN we should
pass it on to the caller. If we get an error while flushing the inode then
re-dirty it so we can try again later.

SGI-PV: 971670
SGI-Modid: xfs-linux-melb:xfs-kern:29860a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:22:28 +10:00
Lachlan McIlroy
c2cba57e83 [XFS] This fix prevents bulkstat from spinning in an infinite loop.
Here 'agino' increments through the inodes in an allocation group. At the
end of the innermost 'for' loop it will hold the value of the next inode
to look at (ie the first inode in the next cluster/chunk). Assigning
'lastino' to 'agino' resets it to the last inode in the last inode cluster
we just looked at. This causes us to look up the very same cluster and
examine all the inodes all over again, and again, and again...

We also want to set 'lastino' for the cases when we're not interested in
the inode so that the next call to bulkstat won't re-examine the same
uninteresting inodes.

SGI-PV: 971064
SGI-Modid: xfs-linux-melb:xfs-kern:29840a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:21:56 +10:00
Christoph Hellwig
3e5daf05a0 [XFS] simplify xfs_create/mknod/symlink prototype
Simplify the prototype for xfs_create/xfs_mkdir/xfs_symlink by not passing
down a bhv_vattr_t that just hogs stack space. Instead pass down the mode
in a mode_t and in case of xfs_create the rdev as a scalar type as well.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29794a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 14:15:32 +10:00
Christoph Hellwig
c83bfab1fa [XFS] avoid xfs_getattr in XFS_IOC_FSGETXATTR ioctl
No need to call into xfs_getattr and put a big bhv_vattr_t on the stack
just to get a little information from the XFS inode.

Add a helper called xfs_ioc_fsgetxattr instead that deals with retrieving
the information in a clean way.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29780a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:21:48 +10:00
Vlad Apostolov
859d718279 [XFS] get_bulkall() could return incorrect inode state
In the following scenario xfs_bulkstat() returns incorrect stale inode
state:

1. File_A is created and its inode synced to disk. 2. File_A is unlinked
and doesn't exist anymore. 3. Filesystem sync is invoked. 4. File_B is
created. File_B happens to reclaim File_A's inode. 5. xfs_bulkstat() is
called and detects File_B but reports the

incorrect File_A inode state.

Explanation for the incorrect inode state is that inodes are not
immediately synced on file create for performance reasons. This leaves the
on-disk inode buffer uninitialized (or with old state from a previous
generation inode) and this is what xfs_bulkstat() would report.

The patch marks the on-disk inode buffer "dirty" on unlink. When the inode
is reclaimed (by a new file create), xfs_bulkstat() would filter this
inode by the "dirty" mark. Once the inode is flushed to disk, the on-disk
buffer "dirty" mark is automatically removed and a following
xfs_bulkstat() would return the correct inode state.

Marking the on-disk inode buffer "dirty" on unlink is achieved by setting
the on-disk di_nlink field to 0. Note that the in-core di_nlink has
already been set to 0 and a corresponding transaction logged by
xfs_droplink(). This is an exception from the rule that any on-disk inode
buffer changes has to be followed by a disk write (inode flush).
Synchronizing the in-core to on-disk di_nlink values in advance (before
the actual inode flush to disk) should be fine in this case because the
inode is already unlinked and it would never change its di_nlink again for
this inode generation.

SGI-PV: 970842
SGI-Modid: xfs-linux-melb:xfs-kern:29757a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:21:15 +10:00
Christoph Hellwig
ba532a980b [XFS] Kill unused IOMAP_EOF flag
SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29705a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:20:54 +10:00
Vlad Apostolov
574342f4ad [XFS] fix when DMAPI mount option processing happens
Fix for a regression caused by a recent patch
that moved the DMAPI mount option processing inside xfs_parseargs(). The
DMAPI mount option used to be processed in the DMAPI module loaded before
xfs_parseargs() was invoked.

SGI-PV: 970451
SGI-Modid: xfs-linux-melb:xfs-kern:29683a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:20:39 +10:00
Lachlan McIlroy
5903c4956f [XFS] ensure file size is logged on synchronous writes
Synchronous writes currently log inode changes before syncing pages to
disk. Since the file size is updated on I/O completion we wont be writing
out the updated file size and if we crash the file will have the wrong
size. This change moves the logging after the syncing of the pages to
ensure we log the correct file size.

SGI-PV: 970334
SGI-Modid: xfs-linux-melb:xfs-kern:29649a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:18:38 +10:00
Christoph Hellwig
cc92e7ac8d [XFS] growlock should be a mutex
m_growlock only needs plain binary mutex semantics, so use a struct mutex
instead of a semaphore for it.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29512a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:18:09 +10:00
Christoph Hellwig
0adba5363c [XFS] replace some large xfs_log_priv.h macros by proper functions
... or in the case of XLOG_TIC_ADD_OPHDR remove a useless macro entirely.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29511a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:17:56 +10:00
Christoph Hellwig
b267ce9952 [XFS] kill struct bhv_vfs
Now that struct bhv_vfs doesn't have any members left we can kill it and
go directly from the super_block to the xfs_mount everywhere.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29509a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:17:27 +10:00
Christoph Hellwig
7439449670 [XFS] move syncing related members from struct bhv_vfs to struct xfs_mount
SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29508a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 12:16:35 +10:00
Christoph Hellwig
bd186aa901 [XFS] kill the vfs_flags member in struct bhv_vfs
All flags are added to xfs_mount's m_flag instead. Note that the 32bit
inode flag was duplicated in both of them, but only cleared in the mount
when it was not nessecary due to the filesystem beeing small enough. Two
flags are still required here - one to indicate the mount option setting,
and one to indicate if it applies or not.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29507a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:45:57 +10:00
Christoph Hellwig
0ce4cfd4f7 [XFS] kill the vfs_fsid and vfs_altfsid members in struct bhv_vfs
vfs_altfsid was just a pointer to mp->m_fixedfsid so we can trivially
replace it with the latter. vfs_fsid also was identical to m_fixedfsid
through rather obfuscated ways so we can kill it as well and simply its
only user.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29506a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:45:02 +10:00
Christoph Hellwig
745f691912 [XFS] call common xfs vfs-level helpers directly and remove vfs operations
Also remove the now dead behavior code.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29505a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:44:08 +10:00
Christoph Hellwig
48c872a9f3 [XFS] decontaminate vfs operations from behavior details
All vfs ops now take struct xfs_mount pointers and the behaviour related
glue is split out into methods of its own.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29504a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:43:55 +10:00
Christoph Hellwig
b09cc77109 [XFS] remove dependency of the quota module on behaviors
Mount options are now parsed by the main XFS module and rejected if quota
support is not available, and there are some new quota operation for the
quotactl syscall and calls to quote in the mount, unmount and sync
callchains.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29503a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:43:26 +10:00
Christoph Hellwig
293688ec42 [XFS] remove dependency of the dmapi module on behaviors
Mount options are now parsed by the main XFS module and rejected if dmapi
support is not available, and there is a new dm operation to send the
mount event.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29502a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:41:15 +10:00
Christoph Hellwig
f541d270db [XFS] move freeing the mount structure from xfs_mount_free into the callers
In the next patch we need to look at the mount structure until just before
it's freed, so we need to be able to free it as the very last thing in
xfs_unmount.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29501a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:40:52 +10:00
Christoph Hellwig
0a74cd1964 [XFS] kill struct bhv_vnode
Now that struct bhv_vnode is empty we can just kill it. Retain bhv_vnode_t
as a typedef for struct inode for the time being until all the fallout is
cleaned up.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29500a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:40:24 +10:00
Christoph Hellwig
2aeaa258c0 [XFS] kill the v_number member in struct bhv_vnode
It's entirely unused except for ignored arguments in the mrlock
initialization, so remove it.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29499a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:39:42 +10:00
Christoph Hellwig
1543d79c45 [XFS] move v_trace from bhv_vnode to xfs_inode
struct bhv_vnode is on it's way out, so move the trace buffer to the XFS
inode. Note that this makes the tracing macros rather misnamed, but this
kind of fallout will be fixed up incrementally later on.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29498a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:39:25 +10:00
Christoph Hellwig
b677c210ce [XFS] move v_iocount from bhv_vnode to xfs_inode
struct bhv_vnode is on it's way out, so move the I/O count to the XFS
inode.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29497a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:38:56 +10:00
Christoph Hellwig
09262b4339 [XFS] Create xfs_iflags_test_and_clear helper function
SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29496a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:38:36 +10:00
Christoph Hellwig
b3aea4edc2 [XFS] kill the v_flag member in struct bhv_vnode
All flags previously handled at the vnode level are not in the xfs_inode
where we already have a flags mechanisms and free bits for flags
previously in the vnode.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29495a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:37:29 +10:00
Christoph Hellwig
2f6f7b3d9b [XFS] kill v_vfsp member from struct bhv_vnode
We can easily get at the vfsp through the super_block but it will soon be
gone anyway.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29494a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 11:23:43 +10:00
Christoph Hellwig
739bfb2a7d [XFS] call common xfs vnode-level helpers directly and remove vnode operations
SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29493a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-16 10:40:00 +10:00
Christoph Hellwig
993386c19a [XFS] decontaminate vnode operations from behavior details
All vnode ops now take struct xfs_inode pointers and the behaviour related
glue is split out into methods of it's own. This required fixing
xfs_create/mkdir/symlink to not mess with the inode pointer but rather use
a separate boolean for error handling. Thanks to Dave Chinner for that
fix.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29492a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:54:29 +10:00
Vlad Apostolov
b93bd20cd5 [XFS] do not have XFSMNT_IDELETE as default when mounted with XFSMNT_DMAPI
XFS inodes are dynamically allocated on demand, rather than being
allocated at mkfs time. Chunks of 64 inodes are allocated at once, but
they are never freed. Over time, this can lead to filesystem
fragmentation, clusters of inodes and the btrees which point at them can
be scattered around the system.

By freeing clusters as they are emptied, we will reduce fragmentation of
the free space after removing files. This in turn will allow us to make
better placement decisions when repopulating a filesystem. The
XFSMNT_IDELETE mount option enables freeing clusters when they get empty.

Unfortunately a side effect of freeing inode clusters is that the inode
generation numbers of such inodes would be reset to zero when the cluster
is reclaimed. This is a problem in particular for a DMAPI enabled
filesystem as the the DMAPI handles need to be unique and persistent in
time. An unique DMAPI handle is built with the help of the inode
generation number. When the last one is prematurely reset by an inode
cluster reclaim, there is a high probability of different generation
inodes to end up having identical DMAPI handles.

To avoid the problem with identical DMAPI handles, the XFSMNT_IDELETE
mount option should be set as default, only if the filesystem is not
mounted with XFSMNT_DMAPI.

SGI-PV: 969192
SGI-Modid: xfs-linux-melb:xfs-kern:29486a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Mark Goodwin <markgw@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:53:36 +10:00
David Chinner
da353b0d64 [XFS] Radix tree based inode caching
One of the perpetual scaling problems XFS has is indexing it's incore
inodes. We currently uses hashes and the default hash sizes chosen can
only ever be a tradeoff between memory consumption and the maximum
realistic size of the cache.

As a result, anyone who has millions of inodes cached on a filesystem
needs to tunes the size of the cache via the ihashsize mount option to
allow decent scalability with inode cache operations.

A further problem is the separate inode cluster hash, whose size is based
on the ihashsize but is smaller, and so under certain conditions (sparse
cluster cache population) this can become a limitation long before the
inode hash is causing issues.

The following patchset removes the inode hash and cluster hash and
replaces them with radix trees to avoid the scalability limitations of the
hashes. It also reduces the size of the inodes by 3 pointers....

SGI-PV: 969561
SGI-Modid: xfs-linux-melb:xfs-kern:29481a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:50:50 +10:00
Christoph Hellwig
39cd9f877e [XFS] kill move.[ch]
Kill uio related functions and defines now that they're unused.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29480a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:50:26 +10:00
Christoph Hellwig
804c83c376 [XFS] stop using uio in the readlink code
Simplify the readlink code to get rid of the last user of uio.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29479a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:50:13 +10:00
Christoph Hellwig
051e7cd44a [XFS] use filldir internally
Currently xfs has a rather complicated internal scheme to allow for
different directory formats in IRIX. This patch rips all code related to
this out and pushes useage of the Linux filldir callback into the lowlevel
directory code. This does not make the code any less portable because
filldir can be used to create dirents of all possible variations
(including the IRIX ones as proved by the IRIX binary emulation code under
arch/mips/).

This patch get rid of an unessecary copy in the readdir path, about 400
lines of code and one of the last two users of the uio structure.

This version is updated to deal with dmapi aswell which greatly simplifies
the get_dirattrs code. The dmapi part has been tested using the
get_dirattrs tools from the xfstest dmapi suite1 with various small and
large directories.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29478a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:49:49 +10:00
Christoph Hellwig
2bdf7cd0ba [XFS] superblock endianess annotations
Creates a new xfs_dsb_t that is __be annotated and keeps xfs_sb_t for the
incore one. xfs_xlatesb is renamed to xfs_sb_to_disk and only handles the
incore -> disk conversion. A new helper xfs_sb_from_disk handles the other
direction and doesn't need the slightly hacky table-driven approach
because we only ever read the full sb from disk.

The handling of shared r/o filesystems has been buggy on little endian
system and fixing this required shuffling around of some code in that
area.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29477a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:49:09 +10:00
Christoph Hellwig
347d1c0195 [XFS] dinode endianess annotations
Biggest bit is duplicating the dinode structure so we have one annotated for
native endianess and one for disk endianess. The other significant change
is that xfs_xlate_dinode_core is split into one helper per direction to
allow for proper annotations, everything else is trivial.

As a sidenode splitting out the incore dinode means we can move it into
xfs_inode.h in a later patch and severely improving on the include hell in
xfs.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29476a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:48:30 +10:00
Michal Piotrowski
ddc6d3b32a [XFS] Fix build regression from mod/commit which did cleanup of xfs_bmbt_*set_allf
In sgi mod# xfs-linux-melb:xfs-kern:29319a, the variable renaming was not
complete and variable 'b' was left unchanged for non-lbd 32 bit machines.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29469a

Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:47:32 +10:00
Eric Sandeen
948c6d4fd8 [XFS] optimize dmapi event tests w/o dmapi config
SGI-PV: 969372
SGI-Modid: xfs-linux-melb:xfs-kern:29444a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:45:55 +10:00
Christoph Hellwig
eb9df39daf [XFS] remove unessecary vfs argument to DM_EVENT_ENABLED
SGI-PV: 968690
SGI-Modid: xfs-linux-melb:xfs-kern:29340a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:45:25 +10:00
Jesper Juhl
49ee6c911f [XFS] Fix a potential NULL pointer deref in XFS on failed mount.
If we fail to open the the log device buftarg, we can fall through to
error handling code that fails to check for a NULL log device buftarg
before calling xfs_free_buftarg().

This patch fixes the issue by checking mp->m_logdev_targp against NULL in
xfs_unmountfs_close() and doing the proper xfs_blkdev_put(logdev); and
xfs_blkdev_put(rtdev); on (!mp->m_rtdev_targp) in xfs_mount().

Discovered by the Coverity checker.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29328a

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:42:48 +10:00
Eric Sandeen
dcb3b83feb [XFS] clean up xfs_start_flags
xfs_start_flags can make use of is_power_of_2 to tidy up the test a little
bit.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29327a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:42:18 +10:00
Eric Sandeen
af3a2e8a3f [XFS] move linux/log2.h header to xfs_linux.h
Generally we try not to directly include linux header files in core xfs
code; xfs_linux.h is the spot for that.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29326a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:40:46 +10:00
Eric Sandeen
6385f4d557 [XFS] Remove xfs_physmem
Now that nobody's using it, remove xfs_physmem & friends.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29325a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:40:14 +10:00
Eric Sandeen
425f9ddd53 [XFS] Pick a single default inode cluster size.
Remove scaling of inode "clusters" based on machine memory; small cluster
cut-point was an unrealistic 32MB and was probably never tested.

Removes another user of xfs_physmem.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29324a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:39:35 +10:00
Eric Sandeen
1cb5125875 [XFS] choose single default logbuf count & size
Remove sizing of logbuf size & count based on physical memory; this was
never a very good gauge as it's looking at global memory, but deciding on
sizing per-filesystem; no account is made of the total number of
filesystems, for example.

For now just take the largest "default" case, as was set for machines with
>400MB - 8 x 32k buffers. This can always be tuned higher or lower with
mount options if necessary. Removes one more user of xfs_physmem.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29323a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:38:23 +10:00
Eric Sandeen
40906630f1 [XFS] Remove m_nreadaheads
m_nreadaheads in the mount struct is never used; remove it and the various
macros assigned to it. Also remove a couple other unused macros in the
same areas.

Removes one user of xfs_physmem.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29322a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:37:46 +10:00
Christoph Hellwig
cd8b0a97bd [XFS] endianess annotations for xfs_bmbt_rec_t
SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29321a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:26:44 +10:00
Christoph Hellwig
e05596643d [XFS] cleanup defintions of BMBT_*BITLEN macros
The BMBT_*BITLEN are currently defined in a complicated way depending on
XFS_NATIVE_HOST. But if all the macros are expanded they (obviously)
expand to the same value for both cases.

This patch defines the macros in the most simple way and updates the
comment describing them to remove outdated bits.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29320a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:26:31 +10:00
Christoph Hellwig
8cba43447e [XFS] clean up xfs_bmbt_set_all/xfs_bmbt_disk_set_all
xfs_bmbt_set_all/xfs_bmbt_disk_set_all are identical to
xfs_bmbt_set_allf/xfs_bmbt_disk_set_allf except that the former take a
xfs_bmbt_irec_t and the latter take the individual extent fields as scalar
values.

This patch reimplements xfs_bmbt_set_all/xfs_bmbt_disk_set_all as trivial
wrappers around xfs_bmbt_set_allf/xfs_bmbt_disk_set_allf and cleans up the
variable naming in xfs_bmbt_set_allf/xfs_bmbt_disk_set_allf to have some
meaning instead of one char variable names.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29319a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:26:13 +10:00
Christoph Hellwig
a6f64d4aea [XFS] split ondisk vs incore versions of xfs_bmbt_rec_t
currently xfs_bmbt_rec_t is used both for ondisk extents as well as
host-endian ones. This patch adds a new xfs_bmbt_rec_host_t for the native
endian ones and cleans up the fallout. There have been various endianess
issues in the tracing / debug printf code that are fixed by this patch.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29318a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:25:51 +10:00
Christoph Hellwig
d580ef6eaa [XFS] remove confusing INT_ comments in xfs_bmap_btree.c
SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29317a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:25:37 +10:00
Vlad Apostolov
3bacbcd883 [XFS] hole not shown when file is created with resvsp
SGI-PV: 967674
SGI-Modid: xfs-linux-melb:xfs-kern:29211a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:24:21 +10:00
David Chinner
0bfefc46dc [XFS] Barriers need to be dynamically checked and switched off
If the underlying block device suddenly stops supporting barriers, we need
to handle the -EOPNOTSUPP error in a sane manner rather than shutting
down the filesystem. If we get this error, clear the barrier flag, reissue
the I/O, and tell the world bad things are occurring.

SGI-PV: 964544
SGI-Modid: xfs-linux-melb:xfs-kern:28568a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-15 16:23:45 +10:00
Al Viro
782e3b3b38 Fix up more bio fallout
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-12 00:29:50 -07:00
NeilBrown
6712ecf8f6 Drop 'size' argument from bio_endio and bi_end_io
As bi_end_io is only called once when the reqeust is complete,
the 'size' argument is now redundant.  Remove it.

Now there is no need for bio_endio to subtract the size completed
from bi_size.  So don't do that either.

While we are at it, change bi_end_io to return void.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-10 09:25:57 +02:00
Tim Shimmin
564256c9e0 Revert "[XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer."
This reverts commit b394e43e99.

Lachlan McIlroy says:
    It tried to fix an issue where log replay is replaying an inode cluster
    initialisation transaction that should not be replayed because the inode
    cluster on disk is more up to date.  Since we don't log file sizes (we
    rely on inode flushing to get them to disk) then we can't just replay
    all the transations in the log and expect the inode to be completely
    restored.  We lose file size updates.  Unfortunately this fix is causing
    more (serious) problems than it is fixing.

SGI-PV: 969656
SGI-Modid: xfs-linux-melb:xfs-kern:29804a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-01 07:59:03 -07:00
Tim Shimmin
053c59a0a7 Revert "[XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer."
This reverts commit b394e43e99.

SGI-PV: 969656
SGI-Modid: xfs-linux-melb:xfs-kern:29804a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-10-01 16:39:37 +10:00
Christoph Hellwig
1bc5858d0d [XFS] fix valid but harmless sparse warning
The new xlog_recover_do_reg_buffer checks call be16_to_cpu on di_gen which
is a 32bit value so sparse rightly complains. Fortunately the warning is
harmless because we don't care for the value, but only whether it's
non-NULL. Due to that fact we can simply kill the endian swaps on this and
the previous di_mode check entirely.

SGI-PV: 969656
SGI-Modid: xfs-linux-melb:xfs-kern:29709a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-20 19:40:40 +10:00
Eric Sandeen
bcc7b445ef [XFS] fix filestreams on 32-bit boxes
xfs_filestream_mount() sets up an mru cache with:
  err = xfs_mru_cache_create(&mp->m_filestream, lifetime, grp_count,
  (xfs_mru_cache_free_func_t)xfs_fstrm_free_func);
but that cast is causing problems...
  typedef void (*xfs_mru_cache_free_func_t)(unsigned long, void*);
but:
  void xfs_fstrm_free_func( xfs_ino_t ino, fstrm_item_t *item)
so on a 32-bit box, it's casting (32, 32) args into (64, 32) and I assume
it's getting garbage for *item, which subsequently causes an explosion.
With this change the filestreams xfsqa tests don't oops on my 32-bit box.

SGI-PV: 967795
SGI-Modid: xfs-linux-melb:xfs-kern:29510a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-20 19:40:19 +10:00
Lachlan McIlroy
b394e43e99 [XFS] Avoid replaying inode buffer initialisation log items if on-disk version is newer.
SGI-PV: 969656
SGI-Modid: xfs-linux-melb:xfs-kern:29676a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-18 20:16:00 +10:00
Lachlan McIlroy
776a75fa5c [XFS] Ensure file size updates have been completed before writing inode to disk.
SGI-PV: 968767
SGI-Modid: xfs-linux-melb:xfs-kern:29675a

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-18 20:12:51 +10:00
David Chinner
65de556756 [XFS] On-demand reaping of the MRU cache
Instead of running the mru cache reaper all the time based on a timeout,
we should only run it when the cache has active objects. This allows CPUs
to sleep when there is no activity rather than be woken repeatedly just to
check if there is anything to do.

SGI-PV: 968554
SGI-Modid: xfs-linux-melb:xfs-kern:29305a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-17 16:42:02 +10:00
Eric Sandeen
5995cb7d80 [XFS] fix nasty quota hashtable allocation bug
This git mod: 77e4635ae1
converted to a "greedy" allocation interface, but for the quota hashtables
it switched from allocating XFS_QM_HASHSIZE (nr of elements)
xfs_dqhash_t's to allocating only XFS_QM_HASHSIZE *bytes* - quite a lot
smaller! Then when we converted hsize "back" to nr of elements (the
division line) hsize went to 0. This was leading to oopses when running
any quota tests on the Fedora 8 test kernel, but the problem has been
there for almost a year.

SGI-PV: 968837
SGI-Modid: xfs-linux-melb:xfs-kern:29354a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-05 14:51:04 +10:00
Christoph Hellwig
265c1fac38 [XFS] fix sparse shadowed variable warnings
- in xfs_probe_cluster rename the inner len to pg_len. There's no harm
  here because the outer len isn't used after the inner len comes into
  existence but it keeps the code clean.
- in xfs_da_do_buf remove the inner i because they don't overlap
  and they are both the same type.

SGI-PV: 968555
SGI-Modid: xfs-linux-melb:xfs-kern:29311a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-05 14:50:26 +10:00
Christoph Hellwig
ee5c80239d [XFS] fix ASSERT and ASSERT_ALWAYS
- remove the != 0 inside the unlikely in ASSERT_ALWAYS because sparse now
  complains about comparisons between pointers and 0
- add a standalone ASSERT implementation because defining it to
  ASSERT_ALWAYS means the string is expanded before the token passing
  stringification. This way we get the actual content of the
  assertion in the assfail message and don't overflow sparse's
  stringification buffer leading to sparse error messages.

SGI-PV: 968555
SGI-Modid: xfs-linux-melb:xfs-kern:29310a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-05 14:49:30 +10:00
Christoph Hellwig
34521c5e49 [XFS] Fix sparse warning in kmem_shake_allow
We can't return a masked result of a __bitwise type. Compare it to 0 first
to keep the behaviour without the warning.

SGI-PV: 968555
SGI-Modid: xfs-linux-melb:xfs-kern:29309a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-05 14:48:00 +10:00
Christoph Hellwig
4b80916b29 [XFS] Fix sparse NULL vs 0 warnings
Sparse now warns about comparing pointers to 0, so change all instance
where that happens to NULL instead.

SGI-PV: 968555
SGI-Modid: xfs-linux-melb:xfs-kern:29308a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-05 14:47:33 +10:00
David Chinner
8da22d7a36 [XFS] Set filestreams object timeout to something sane.
SGI-PV: 968554
SGI-Modid: xfs-linux-melb:xfs-kern:29303a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-09-05 14:47:10 +10:00
Al Viro
ad690ef9e6 xfs ioctl __user annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-26 11:11:57 -07:00
Paul Mundt
20c2df83d2 mm: Remove slab destructors from kmem_cache_create().
Slab destructors were no longer supported after Christoph's
c59def9f22 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2007-07-20 10:11:58 +09:00
Linus Torvalds
fdb64f93b3 Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
  [XFS] Fix inode size update before data write in xfs_setattr
  [XFS] Allow punching holes to free space when at ENOSPC
  [XFS] Implement ->page_mkwrite in XFS.
  [FS] Implement block_page_mkwrite.

Manually fix up conflict with Nick's VM fault handling patches in
fs/xfs/linux-2.6/xfs_file.c

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 14:41:33 -07:00
Nick Piggin
d0217ac04c mm: fault feedback #1
Change ->fault prototype.  We now return an int, which contains
VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
 FAULT_RET_ code tells the VM whether a page was found, whether it has been
locked, and potentially other things.  This is not quite the way he wanted
it yet, but that's changed in the next patch (which requires changes to
arch code).

This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
that a page is locked which requires filemap_nopage to go away (because we
can no longer remain backward compatible without that flag), but we were
going to do that anyway.

struct fault_data is renamed to struct vm_fault as Linus asked. address
is now a void __user * that we should firmly encourage drivers not to use
without really good reason.

The page is now returned via a page pointer in the vm_fault struct.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:41 -07:00
Nick Piggin
54cb8821de mm: merge populate and nopage into fault (fixes nonlinear)
Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.

->populate is a layering violation because the filesystem/pagecache code
should need to know anything about the virtual memory mapping.  The hitch here
is that the ->nopage handler didn't pass down enough information (ie.  pgoff).
 But it is more logical to pass pgoff rather than have the ->nopage function
calculate it itself anyway (because that's a similar layering violation).

Having the populate handler install the pte itself is likewise a nasty thing
to be doing.

This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn.  Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.

The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.

After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache.  Seems like a fringe functionality anyway.

NOPAGE_REFAULT is removed.  This should be implemented with ->fault, and no
users have hit mainline yet.

[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:41 -07:00
Nick Piggin
d00806b183 mm: fix fault vs invalidate race for linear mappings
Fix the race between invalidate_inode_pages and do_no_page.

Andrea Arcangeli identified a subtle race between invalidation of pages from
pagecache with userspace mappings, and do_no_page.

The issue is that invalidation has to shoot down all mappings to the page,
before it can be discarded from the pagecache.  Between shooting down ptes to
a particular page, and actually dropping the struct page from the pagecache,
do_no_page from any process might fault on that page and establish a new
mapping to the page just before it gets discarded from the pagecache.

The most common case where such invalidation is used is in file truncation.
This case was catered for by doing a sort of open-coded seqlock between the
file's i_size, and its truncate_count.

Truncation will decrease i_size, then increment truncate_count before
unmapping userspace pages; do_no_page will read truncate_count, then find the
page if it is within i_size, and then check truncate_count under the page
table lock and back out and retry if it had subsequently been changed (ptl
will serialise against unmapping, and ensure a potentially updated
truncate_count is actually visible).

Complexity and documentation issues aside, the locking protocol fails in the
case where we would like to invalidate pagecache inside i_size.  do_no_page
can come in anytime and filemap_nopage is not aware of the invalidation in
progress (as it is when it is outside i_size).  The end result is that
dangling (->mapping == NULL) pages that appear to be from a particular file
may be mapped into userspace with nonsense data.  Valid mappings to the same
place will see a different page.

Andrea implemented two working fixes, one using a real seqlock, another using
a page->flags bit.  He also proposed using the page lock in do_no_page, but
that was initially considered too heavyweight.  However, it is not a global or
per-file lock, and the page cacheline is modified in do_no_page to increment
_count and _mapcount anyway, so a further modification should not be a large
performance hit.  Scalability is not an issue.

This patch implements this latter approach.  ->nopage implementations return
with the page locked if it is possible for their underlying file to be
invalidated (in that case, they must set a special vm_flags bit to indicate
so).  do_no_page only unlocks the page after setting up the mapping
completely.  invalidation is excluded because it holds the page lock during
invalidation of each page (and ensures that the page is not mapped while
holding the lock).

This also allows significant simplifications in do_no_page, because we have
the page locked in the right place in the pagecache from the start.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:41 -07:00
David Chinner
c32676eea1 [XFS] Fix inode size update before data write in xfs_setattr
When changing the file size by a truncate() call, we log the change in the
inode size. However, we do not flush any outstanding data that might not
have been written to disk, thereby violating the data/inode size update
order. This can leave files full of NULLs on crash.

Hence if we are truncating the file, flush any unwritten data that may lie
between the curret on disk inode size and the new inode size that is being
logged to ensure that ordering is preserved.

SGI-PV: 966308
SGI-Modid: xfs-linux-melb:xfs-kern:29174a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-19 19:52:05 +10:00
David Chinner
91ebecc74e [XFS] Allow punching holes to free space when at ENOSPC
Make the free file space transaction able to dip into the reserved blocks
to ensure that we can successfully free blocks when the filesystem is at
ENOSPC.

SGI-PV: 967788
SGI-Modid: xfs-linux-melb:xfs-kern:29167a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-19 19:51:46 +10:00
David Chinner
4f57dbc6b5 [XFS] Implement ->page_mkwrite in XFS.
Hook XFS up to ->page_mkwrite to ensure that we know about mmap pages
being written to. This allows use to do correct delayed allocation and
ENOSPC checking as well as remap unwritten extents so that they get
converted correctly during writeback. This is done via the generic
block_page_mkwrite code.

SGI-PV: 940392
SGI-Modid: xfs-linux-melb:xfs-kern:29149a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-19 19:51:21 +10:00
Christoph Hellwig
a569425512 knfsd: exportfs: add exportfs.h header
currently the export_operation structure and helpers related to it are in
fs.h.  fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.

[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:06 -07:00
Rafael J. Wysocki
8314418629 Freezer: make kernel threads nonfreezable by default
Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves.  This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (ie.  to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE.  It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear.  Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
Rusty Russell
8e1f936b73 mm: clean up and kernelify shrinker registration
I can never remember what the function to register to receive VM pressure
is called.  I have to trace down from __alloc_pages() to find it.

It's called "set_shrinker()", and it needs Your Help.

1) Don't hide struct shrinker.  It contains no magic.
2) Don't allocate "struct shrinker".  It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Reduce the 17 lines of waffly comments to 13, but document it properly.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Chinner <dgc@sgi.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:00 -07:00
David Chinner
0f1145cc18 [XFS] Fix lockdep annotations for xfs_lock_inodes
SGI-PV: 967035
SGI-Modid: xfs-linux-melb:xfs-kern:29026a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
2007-07-14 18:09:42 +10:00