Commit Graph

668 Commits

Author SHA1 Message Date
Trond Myklebust
6ecc5e8fca NFS: Fix dcache revalidation bugs
We don't need to force a dentry lookup just because we're making changes to
the directory.

Don't update nfsi->cache_change_attribute in nfs_end_data_update: that
overrides the NFSv3/v4 weak consistency checking that tells us our update
was the only one, and that tells us the dcache is still valid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:49 -04:00
Trond Myklebust
7957c1418f NFS: fix nfs_verify_change_attribute
We always want to check that the verifier and directory
cache_change_attribute match. This also allows us to remove the 'wraparound
hack' for the cache_change_attribute. If we're only checking for equality,
then we don't care about wraparound issues.
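
A minimal sketch of why an equality test sidesteps wraparound, assuming a 32-bit counter. The function names and the ordering-based variant are hypothetical illustrations, not the code this patch touches:

```c
#include <stdbool.h>
#include <stdint.h>

/* Equality check: valid only if the dentry's verifier is exactly the
 * directory's current change attribute. Wraparound cannot confuse it. */
static bool verifier_matches(uint32_t dir_change_attr, uint32_t dentry_verf)
{
	return dentry_verf == dir_change_attr;
}

/* Hypothetical ordering-based check: "the verifier is not older than the
 * directory". Once the counter wraps past zero, a stale verifier can
 * compare as newer than the directory's value. */
static bool verifier_not_older(uint32_t dir_change_attr, uint32_t dentry_verf)
{
	return dentry_verf >= dir_change_attr;
}
```

If the directory counter wraps from UINT32_MAX to 0, a dentry stamped with UINT32_MAX still passes the ordering check but fails the equality check, which is the desired result.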

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:46 -04:00
Trond Myklebust
68e8a70d3c NFS: nfs_post_op_update_inode() should call nfs_refresh_inode()
Ensure that we don't clobber the results from a more recent getattr call...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:43 -04:00
Trond Myklebust
f2115dc987 NFS: Fix over-conservative attribute invalidation in nfs_update_inode()
We should always be declaring the attribute cache as valid after having
updated it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:40 -04:00
Trond Myklebust
76b32999df NFSv4: Make NFSv4 ACCESS calls return attributes too...
It doesn't really make sense to cache an access call without also
revalidating the attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:38 -04:00
Trond Myklebust
af22f94ae0 NFSv4: Simplify _nfs4_do_access()
Currently, _nfs4_do_access() is just a copy of nfs_do_access() with added
conversion of the open flags into an access mask. This patch merges the
duplicate functionality.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:34 -04:00
Trond Myklebust
cd3758e37d NFS: Replace file->private_data with calls to nfs_file_open_context()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:31 -04:00
Chuck Lever
8fb559f87f NFS: Eliminate nfs_refresh_verifier()
nfs_set_verifier() and nfs_refresh_verifier() do exactly the same thing, so
replace one with the other.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:26 -04:00
Chuck Lever
77a55a1fe8 NFS: Eliminate nfs_renew_times()
The nfs_renew_times() function plants the current time in jiffies in
dentry->d_time.  But a call to nfs_renew_times() is always followed by
another call that overwrites dentry->d_time.  Get rid of the
nfs_renew_times() calls.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:24 -04:00
Chuck Lever
92f6c17825 NFS: Don't call nfs_renew_times() in nfs_dentry_iput()
Negative dentries need to be reverified after an asynchronous unlink.

Quoth Trond:

"Unfortunately I don't think that we can avoid revalidating the
resulting negative dentry since the UNLINK call is asynchronous,
and so the new verifier on the directory will only be known a
posteriori."

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:22 -04:00
Chuck Lever
bcf35617a7 NFS: Show "nointr" mount option
The default "intr" setting is different for NFS and NFSv4.  To avoid
confusion on this issue, don't hide the "nointr" option in /proc/mounts.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:17 -04:00
Chuck Lever
6e88e0618c NFS: Verify server address before invoking in-kernel mount client
Re-order mount option sanity checking slightly to ensure we have a valid
server address *before* trying to do the mountd RPC call.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:14 -04:00
"Talpey, Thomas"
2cf7ff7a37 NFS: support RDMA mounts
Adds hooks to the string-based NFS mount to support an "rdma" protocol option.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:00 -04:00
"Talpey, Thomas"
56928edd5a NFS - print accurate transport protocol
Use the per-transport strings to display the transport protocol accurately.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:55 -04:00
"Talpey, Thomas"
0896a725a1 NFS/SUNRPC: use transport protocol naming
Instead of an { address family, raw IP protocol number }-tuple, use the
newly-defined RPC identifier when creating clients in the upper layers.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:53 -04:00
"Talpey, Thomas"
4f22ccc346 SUNRPC: mark bulk read/write data in xdrbuf
Adds a flag word to the xdrbuf struct which indicates any bulk
disposition of the data. This enables RPC transport providers to
marshal it efficiently/appropriately, and may enable other
optimizations.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:34 -04:00
Trond Myklebust
20c71f5e0f NFSv4: Fix a bug in nfs4_validate_mount_data()
The previous patch introduced a bug when copying the server address.

Also clarify a copy into the auth_flavours array: currently the two
size calculations are equivalent, but we may decide to change the size
of auth_flavors[] at some point.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:31 -04:00
"Talpey, Thomas"
91ea40b9c6 NFS: use in-kernel mount argument structure for nfsv4 mounts
The user-visible nfs4_mount_data does not contain sufficient data to
describe new mount options, and also is now a legacy structure. Replace
it with the internal nfs_parsed_mount_data for nfsv4 in-kernel use.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:28 -04:00
"Talpey, Thomas"
2283f8d6ed NFS: use in-kernel mount argument structure for nfsv[23] mounts
The user-visible nfs_mount_data does not contain sufficient data to
describe new mount options, and also is now a legacy structure. Replace
it with the internal nfs_parsed_mount_data for nfsv[23] in-kernel use.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:26 -04:00
"Talpey, Thomas"
6b18eaa082 NFS: move nfs_parsed_mount_data structure definition
In preparation for rearranging the nfs mount argument passing, make the
nfs_parsed_mount_data struct visible across nfs kernel files.

Signed-off-by: Tom Talpey <tmt@netapp.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:23 -04:00
Chuck Lever
fe82a183ca NFS: Convert printk's to dprintk's in fs/nfs/nfs?xdr.c
This follows the recent edict to replace or remove printk's that can be
triggered en masse by remote misbehavior.  A few that only occur just before
a BUG are left in place.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:09 -04:00
Chuck Lever
0ac83779fa NFS: Add new 'mountaddr=' mount option
I got the 'mounthost=' option wrong - it shouldn't look for an address
value, but rather a hostname value.  However, the in-kernel mount client
and NFS client cannot resolve a hostname by themselves; they rely on
user-land to pass in the resolved address.

Create a new mount option that does take an address so that the mount
program's address can be passed in.  The mount hostname is now ignored
by the kernel.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:06 -04:00
James Lentini
aad7000735 [NFS] [PATCH] NFS: initialize default port in kernel mount client
If no mount server port number is specified, the previous change to the
kernel mount client inadvertently allows the NFS server's port number to be
used as the mount server's port number. If the user specifies an NFS
server port (-o port=x), the mount will fail.

The fix below sets the mount server's port to 0 if no mount server
port is specified by the user.

Signed-off-by: James Lentini <jlentini@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:04 -04:00
Chuck Lever
efd8340bb1 NFS: Kernel mount client should use async bind
Simplify the in-kernel mount client by using autobind instead of an
explicit call to rpc_getport_sync.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:17:01 -04:00
Jeff Layton
ddc01c0813 [NFS] [PATCH] NFS: show addr=ipaddr in /proc/mounts rather than
A minor thing, but useful when working with a server with multiple
addrs. This looks like it might also be necessary if Miklos' effort
to eliminate /etc/mtab ever comes to fruition.

When displaying mount options in /proc/mounts, the kernel prints
"addr=hostname". This info is redundant since we already have the
hostname displayed as part of the "device" section of the mount. This
patch changes it to display the IP address to which the socket is
connected.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:39 -04:00
Christoph Hellwig
f8cf3678f4 [NFS] [PATCH] nfs: tiny makefile cleanup
no need to set up foo-objs these days.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:36 -04:00
Fabio Olive Leite
c7e1596111 Re: [NFS] [PATCH] Attribute timeout handling and wrapping u32 jiffies
I would like to discuss the idea that the current checks for attribute
timeout using time_after are inadequate for 32bit architectures, since
time_after works correctly only when the two timestamps being compared
are within 2^31 jiffies of each other. The signed overflow caused by
comparing values more than 2^31 jiffies apart will flip the result,
causing incorrect assumptions of validity.

2^31 jiffies is a fairly large period of time (~25 days) when compared
to the lifetime of most kernel data structures, but for long lived NFS
mounts that can sit idle for months (think that for some reason autofs
cannot be used), it is easy to compare inode attribute timestamps with
very disparate or even bogus values (as in when jiffies have wrapped
many times, where the comparison doesn't even make sense).

Currently the code tests for attribute timeout by simply adding the
desired amount of jiffies to the stored timestamp and comparing that
with the current timestamp of obtained attribute data with time_after.
This is incorrect, as it returns true for the desired timeout period
and another full 2^31 range of jiffies.
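
The failure mode described above can be sketched in user space with 32-bit counters. This is an illustration under assumptions: time_after32() mirrors the shape of the kernel's time_after() macro, and the two attr_expired_* helpers are hypothetical names, not the functions changed by the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit analogue of time_after(a, b): true if a is after b. Only
 * meaningful when a and b are within 2^31 ticks of each other. The cast
 * relies on the usual two's-complement conversion. */
static bool time_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* One-sided check: expired once "now" is after read_time + timeout.
 * Wrongly reports "still valid" for a further 2^31 ticks once the
 * signed difference overflows. */
static bool attr_expired_one_sided(uint32_t now, uint32_t read_time,
				   uint32_t timeout)
{
	return time_after32(now, read_time + timeout);
}

/* Windowed check: valid only while now lies inside
 * [read_time, read_time + timeout]; everything else counts as expired. */
static bool attr_expired_windowed(uint32_t now, uint32_t read_time,
				  uint32_t timeout)
{
	uint32_t elapsed = now - read_time;	/* modular, wrap-safe */

	return elapsed > timeout;
}
```

With read_time = 1000 and timeout = 3000, both checks agree for now = 2000 and now = 5000. For a value more than 2^31 ticks later, such as now = 1000 + 0x90000000, only the windowed check reports the attributes as expired.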

In testing with artificial jumps (several small jumps, not one big
crank) of the jiffies I was able to reproduce a problem found in a
server with very long lived NFS mounts, where attributes would not be
refreshed even after touching files and directories in the server:

Initial uptime:
03:42:01 up 6 min, 0 users, load average: 0.01, 0.12, 0.07

NFS volume is mounted and time is advanced:
03:38:09 up 25 days, 2 min, 0 users, load average: 1.22, 1.05, 1.08

# ls -l /local/A/foo/bar /nfs/A/foo/bar
-rw-r--r--  1 root root 0 Dec 17 03:38 /local/A/foo/bar
-rw-r--r--  1 root root 0 Nov 22 00:36 /nfs/A/foo/bar

# touch /local/A/foo/bar

# ls -l /local/A/foo/bar /nfs/A/foo/bar
-rw-r--r--  1 root root 0 Dec 17 03:47 /local/A/foo/bar
-rw-r--r--  1 root root 0 Nov 22 00:36 /nfs/A/foo/bar

We can see the local mtime is updated, but the NFS mount still shows
the old value. The patch below makes it work:

Initial setup...
07:11:02 up 25 days, 1 min,  0 users,  load average: 0.15, 0.03, 0.04

# ls -l /local/A/foo/bar /nfs/A/foo/bar
-rw-r--r--  1 root root 0 Jan 11 07:11 /local/A/foo/bar
-rw-r--r--  1 root root 0 Jan 11 07:11 /nfs/A/foo/bar

# touch /local/A/foo/bar

# ls -l /local/A/foo/bar /nfs/A/foo/bar
-rw-r--r--  1 root root 0 Jan 11 07:14 /local/A/foo/bar
-rw-r--r--  1 root root 0 Jan 11 07:14 /nfs/A/foo/bar

Signed-off-by: Fabio Olive Leite <fleite@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:33 -04:00
Peter Staubach
4e769b934e 64 bit ino support for NFS client
Hi.

Attached is a patch to modify the NFS client code to support
64 bit ino's, as appropriate for the system and the NFS
protocol version.

The code basically just expands the NFS interfaces for routines
which handle ino's from using ino_t to u64 and then uses the
fileid in the nfs_inode instead of i_ino in the inode.  The
code paths that were updated are in the getattr method and
the readdir methods.

This should be no real change on 64 bit platforms.  Since
the ino_t is an unsigned long, it would already be 64 bits
wide.
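
To illustrate why the full 64-bit fileid matters where inode numbers are only 32 bits wide, here is a small sketch. The ino32_t type and the XOR-folding in fold_fileid() are assumptions made for the example, not necessarily what the patch does:

```c
#include <stdint.h>

/* Hypothetical 32-bit inode number, as on a platform where ino_t is a
 * 32-bit unsigned long. */
typedef uint32_t ino32_t;

/* Fold a 64-bit NFS fileid into 32 bits by XORing the two halves.
 * Distinct fileids can collide, so the folded value is not a reliable
 * identity; comparisons should use the 64-bit fileid itself. */
static ino32_t fold_fileid(uint64_t fileid)
{
	ino32_t ino = (ino32_t)fileid;

	ino ^= (ino32_t)(fileid >> 32);
	return ino;
}
```

For example, fileids 0x0000000100000002 and 0x0000000200000001 both fold to 3, while a fileid that already fits in 32 bits is returned unchanged.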

    Thanx...

           ps

Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:29 -04:00
Trond Myklebust
7b159fc18d NFS: Fall back to synchronous writes when a background write errors...
This helps prevent huge queues of background writes from building up
whenever the server runs out of disk or quota space, or if someone changes
the file access modes behind our backs.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:23 -04:00
Trond Myklebust
34901f70d1 NFS: Writeback optimisation
Schedule writes using WB_SYNC_NONE first, then come back for a second pass
using WB_SYNC_ALL.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:21 -04:00
Trond Myklebust
ed90ef51a3 NFS: Clean up NFS writeback flush code
The only user of nfs_sync_mapping_range() is nfs_getattr(), which uses it
to flush out the entire inode without sending a commit. We therefore
replace nfs_sync_mapping_range with a more appropriate helper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:18 -04:00
Trond Myklebust
f758c88519 NFS: Clean up nfs_writepages()
Just call write_cache_pages directly instead of hacking the writeback
control structure in order to find out if we were called from writepages()
or directly from the VM.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:13 -04:00
Trond Myklebust
9cccef9505 NFS: Clean up write code...
The addition of nfs_page_mkwrite means that we should no longer need to
create requests inside nfs_writepage().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:11 -04:00
Trond Myklebust
94387fb1aa NFS: Add the helper nfs_vm_page_mkwrite
This is needed in order to set up a proper nfs_page request for mmapped
files.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:08 -04:00
Trond Myklebust
54af3bb543 NFS: Fix an Oops in encode_lookup()
It doesn't look as if the NFS file name limit is being initialised correctly
in the struct nfs_server. Make sure that we limit whatever is being set in
nfs_probe_fsinfo() and nfs_init_server().

Also ensure that readdirplus and nfs4_path_walk respect our file name
limits.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-28 15:36:42 -07:00
Alexey Dobriyan
49af7ee181 nfs: fix oops re sysctls and V4 support
NFS unregisters sysctls only if V4 support is compiled in.  However, sysctl
table is not V4 specific, so unregister it always.

Steps to reproduce:

	[build nfs.ko with CONFIG_NFS_V4=n]
	modprobe nfs
	rmmod nfs
	ls /proc/sys

Unable to handle kernel paging request at ffffffff880661c0 RIP:
 [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
PGD 203067 PUD 207063 PMD 7e216067 PTE 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: lockd nfs_acl sunrpc
Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2
RIP: 0010:[<ffffffff802af8e3>]  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
RSP: 0018:ffff81007fd93e78  EFLAGS: 00010286
RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0
RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40
RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0
R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280
FS:  00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000)
Stack:  ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40
 ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a
 2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640
Call Trace:
 [<ffffffff80283f30>] filldir+0x0/0xf0
 [<ffffffff80283f30>] filldir+0x0/0xf0
 [<ffffffff802840c7>] vfs_readdir+0xa7/0xc0
 [<ffffffff80284376>] sys_getdents+0x96/0xe0
 [<ffffffff8020bb3e>] system_call+0x7e/0x83

Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b
RIP  [<ffffffff802af8e3>] proc_sys_readdir+0xd3/0x350
 RSP <ffff81007fd93e78>
CR2: ffffffff880661c0
Kernel panic - not syncing: Fatal exception

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-09-19 11:24:18 -07:00
Trond Myklebust
1b3b4a1a2d NFS: Fix a write request leak in nfs_invalidate_page()
Ryusuke Konishi says:

The recent truncate_complete_page() clears the dirty flag from a page
before calling a_ops->invalidatepage(),
^^^^^^
static void
truncate_complete_page(struct address_space *mapping, struct page *page)
{
        ...
        cancel_dirty_page(page, PAGE_CACHE_SIZE);  <--- Inserted here at kernel 2.6.20

        if (PagePrivate(page))
                do_invalidatepage(page, 0);   ---> will call a_ops->invalidatepage()
        ...
}

and this is disturbing nfs_wb_page_priority() from calling 
nfs_writepage_locked() that is expected to handle the pending
request (=nfs_page) associated with the page.

int nfs_wb_page_priority(struct inode *inode, struct page *page, int how)
{
        ...
        if (clear_page_dirty_for_io(page)) {
                ret = nfs_writepage_locked(page, &wbc);
                if (ret < 0)
                        goto out;
        }
        ...
}

Since truncate_complete_page() will get rid of the page after
a_ops->invalidatepage() returns, the request (=nfs_page) associated
with the page becomes garbage in nfs_inode->nfs_page_tree.
------------------------

Fix this by ensuring that nfs_wb_page_priority() recognises that it may
also need to clear out non-dirty pages that have an nfs_page associated
with them.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:54 -04:00
Chuck Lever
7d1cca7299 NFS: change NFS mount error return when hostname/pathname too long
According to the mount(2) man page, the proper error return code for the
mount(2) system call when the special device name or the mounted-on
directory name is too long is ENAMETOOLONG.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:40 -04:00
Chuck Lever
350c73af6a NFS: Off-by-one length error in string handling
The hostname was getting truncated in the new text-based NFS mount API.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:40 -04:00
Chuck Lever
fdc6e2c8c0 NFS: Return a real error code from mount(2)
Don't filter the return code from the in-kernel rpcbind or NFS mount
clients.  Return the real error code so that callers of the new NFS
text-based mount API can apply a useful retry strategy.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:39 -04:00
Chuck Lever
fdb66ff4ac NFS: mount option parser chokes on proto=
The new text-based NFS mount option parsing logic doesn't recognize any
valid transport protocols due to a silly mistake in the protocol token
matching logic.  This prevents basic mount requests such as:

   mount.nfs server:/export /mnt -o proto=tcp

from working with the new text-based NFS mount API.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:38 -04:00
Trond Myklebust
deee9369b9 NFSv4: Ensure that we pass the correct dentry to nfs4_intent_set_file
This patch fixes an Oops that was reported by Gabriel Barazer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:38 -04:00
Trond Myklebust
65bbf6bdbb NFSv4: Fix a typo in _nfs4_do_open_reclaim
This should fix the following Oops reported by Jeff Garzik:

kernel BUG at fs/nfs/nfs4xdr.c:1040!
invalid opcode: 0000 [1] SMP 
CPU 0 
Modules linked in: nfs lockd sunrpc af_packet
ipv6 cpufreq_ondemand acpi_cpufreq battery floppy nvram sg snd_hda_intel
ata_generic snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_page_alloc e1000
firewire_ohci ata_piix i2c_core sr_mod cdrom sata_sil ahci libata sd_mod
scsi_mod ext3 jbd ehci_hcd uhci_hcd
Pid: 16353, comm: 10.10.10.1-recl Not tainted 2.6.23-rc3 #1
RIP: 0010:[<ffffffff88240980>] [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330
RSP: 0018:ffff8100467c5c60  EFLAGS: 00010202
RAX: ffff81000f89b8b8 RBX: 00000000697a6f6d RCX: ffff81000f89b8b8
RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8100467c5c80
RBP: ffff8100467c5c80 R08: ffff81000f89bc30 R09: ffff81000f89b83f
R10: 0000000000000001 R11: ffffffff881e79e0 R12: ffff81003cbd1808
R13: ffff81000f89b860 R14: ffff81005fc984e0 R15: ffffffff88240af0
FS:  0000000000000000(0000) GS:ffffffff8052a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002adb9e51a030 CR3: 000000007ea7e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process 10.10.10.1-recl (pid: 16353, threadinfo ffff8100467c4000, task ffff8100038ce780)
Stack:  ffff81004aeb6a40 ffff81003cbd1808 ffff81003cbd1808 ffffffff88240b5d
 ffff81000f89b8bc ffff81005fc984e8 ffff81000f89bc30 ffff81005fc984e8
 0000000300000000 0000000000000000 0000000000000000 ffff81003cbd1800
Call Trace:
 [<ffffffff88240b5d>] :nfs:nfs4_xdr_enc_open_noattr+0x6d/0x90
 [<ffffffff881e74b7>] :sunrpc:rpcauth_wrap_req+0x97/0xf0
 [<ffffffff88240af0>] :nfs:nfs4_xdr_enc_open_noattr+0x0/0x90
 [<ffffffff881df57a>] :sunrpc:call_transmit+0x18a/0x290
 [<ffffffff881e5e7b>] :sunrpc:__rpc_execute+0x6b/0x290
 [<ffffffff881dff76>] :sunrpc:rpc_do_run_task+0x76/0xd0
 [<ffffffff882373f6>] :nfs:_nfs4_proc_open+0x76/0x230
 [<ffffffff88237a2e>] :nfs:nfs4_open_recover_helper+0x5e/0xc0
 [<ffffffff88237b74>] :nfs:nfs4_open_recover+0xe4/0x120
 [<ffffffff88238e14>] :nfs:nfs4_open_reclaim+0xa4/0xf0
 [<ffffffff882413c5>] :nfs:nfs4_reclaim_open_state+0x55/0x1b0
 [<ffffffff882417ea>] :nfs:reclaimer+0x2ca/0x390
 [<ffffffff88241520>] :nfs:reclaimer+0x0/0x390
 [<ffffffff8024e59b>] kthread+0x4b/0x80
 [<ffffffff8020cad8>] child_rip+0xa/0x12
 [<ffffffff8024e550>] kthread+0x0/0x80
 [<ffffffff8020cace>] child_rip+0x0/0x12


Code: 0f 0b eb fe 48 89 ef c7 00 00 00 00 02 be 08 00 00 00 e8 79 
RIP  [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330
 RSP <ffff8100467c5c60>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:37 -04:00
Trond Myklebust
560aef7450 NFS: Fix use of cancel_delayed_work_sync in nfs_release_automount_timer
Doh! We can't use cancel_delayed_work_sync because we may have been called
from an unmount that was being performed by nfs_automount_task.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:36 -04:00
Trond Myklebust
e89a5a43b9 NFS: Fix the mount regression
This avoids the recent NFS mount regression (returning EBUSY when
mounting the same filesystem twice with different parameters).

The best I can do given the constraints appears to be to have the kernel
first look for a superblock that matches both the fsid and the
user-specified mount options, and then spawn off a new superblock if
that search fails.

Note that this is not the same as specifying nosharecache everywhere
since nosharecache will never attempt to match an existing superblock.
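
The lookup order described above might be sketched as follows. Everything here is a simplified assumption for illustration: struct toy_sb, the options bitmask and toy_find_sb() are hypothetical and do not reflect the kernel's actual superblock code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a mounted superblock. */
struct toy_sb {
	uint64_t fsid;		/* identifies the exported filesystem */
	unsigned int options;	/* user-specified mount options (bitmask) */
	struct toy_sb *next;
};

/* Return an existing superblock matching both fsid and options, or NULL
 * when the caller should spawn a new one. With nosharecache the search is
 * skipped, so an existing superblock is never reused. */
static struct toy_sb *toy_find_sb(struct toy_sb *head, uint64_t fsid,
				  unsigned int options, bool nosharecache)
{
	if (nosharecache)
		return NULL;

	for (struct toy_sb *sb = head; sb != NULL; sb = sb->next) {
		if (sb->fsid == fsid && sb->options == options)
			return sb;
	}
	return NULL;
}
```

A second mount of the same fsid with different options finds no match and gets its own superblock instead of failing with EBUSY, while a mount with identical options still shares the existing one.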

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-08-31 20:26:45 -07:00
Trond Myklebust
3d39c691ff NFS: Replace flush_scheduled_work with cancel_work_sync() and friends
This will avoid deadlocks of the form:

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ee42>] __lock_acquire+0xc22/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c012edd9>] flush_workqueue+0x49/0x70
 [<c012ee0d>] flush_scheduled_work+0xd/0x10
 [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
 [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
 [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
 [<c017b38d>] deactivate_super+0x7d/0xa0
 [<c018f94b>] mntput_no_expire+0x4b/0x80
 [<c018fd94>] expire_mount_list+0xe4/0x140
 [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
 [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
 [<c012e61b>] run_workqueue+0x12b/0x1e0
 [<c012f05b>] worker_thread+0x9b/0x100
 [<c0131c72>] kthread+0x42/0x70
 [<c0104c0f>] kernel_thread_helper+0x7/0x18
 =======================

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 16:12:50 -04:00
Trond Myklebust
905f8d16e3 NFSv4: Don't call put_rpccred() from an rcu callback
Doing so would require us to introduce bh-safe locks into put_rpccred().
This patch fixes the lockdep complaint reported by Marc Dietrich:

inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
{softirq-on-W} state was registered at:
  [<c013e870>] __lock_acquire+0x650/0x1030
  [<c013f2b1>] lock_acquire+0x61/0x80
  [<c02db9ac>] _spin_lock+0x2c/0x40
  [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
  [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
  [<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc]
  [<dced3fd4>] a0 [sunrpc]
  [<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc]
  [<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc]
  [<dced65b3>] svc_register+0x93/0x160 [sunrpc]
  [<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc]
  [<dced7053>] svc_create+0x13/0x20 [sunrpc]
  [<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs]
  [<dcf48f36>] nfs_get_client+0x176/0x390 [nfs]
  [<dcf49181>] nfs4_set_client+0x31/0x190 [nfs]
  [<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs]
  [<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs]
  [<c017b444>] vfs_kern_mount+0x94/0x110
  [<c0190a62>] do_mount+0x1f2/0x7d0
  [<c01910a6>] sys_mount+0x66/0xa0
  [<c0104046>] syscall_call+0x7/0xb
  [<ffffffff>] 0xffffffff
irq event stamp: 5277830
hardirqs last  enabled at (5277830): [<c017530a>] kmem_cache_free+0x8a/0xc0
hardirqs last disabled at (5277829): [<c01752d2>] kmem_cache_free+0x52/0xc0
softirqs last  enabled at (5277798): [<c0124173>] __do_softirq+0xa3/0xc0
softirqs last disabled at (5277817): [<c01241d7>] do_softirq+0x47/0x50

other info that might help us debug this:
no locks held by swapper/0.

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ccc3>] print_usage_bug+0x153/0x160
 [<c013d8b9>] mark_lock+0x449/0x620
 [<c013e824>] __lock_acquire+0x604/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c02db9ac>] _spin_lock+0x2c/0x40
 [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
 [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
 [<dcf6bf83>] nfs_free_delegation_callback+0x13/0x20 [nfs]
 [<c012f9ea>] __rcu_process_callbacks+0x6a/0x1c0
 [<c012fb52>] rcu_process_callbacks+0x12/0x30
 [<c0124218>] tasklet_action+0x38/0x80
 [<c0124125>] __do_softirq+0x55/0xc0
 [<c01241d7>] do_softirq+0x47/0x50
 [<c0124605>] irq_exit+0x35/0x40
 [<c0112463>] smp_apic_timer_interrupt+0x43/0x80
 [<c0104a77>] apic_timer_interrupt+0x33/0x38
 [<c02690df>] cpuidle_idle_call+0x6f/0x90
 [<c01023c3>] cpu_idle+0x43/0x70
 [<c02d8c27>] rest_init+0x47/0x50
 [<c03bcb6a>] start_kernel+0x22a/0x2b0
 [<00000000>] 0x0
 =======================

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:15:57 -04:00
Trond Myklebust
45328c354e NFS: Fix NFSv4 open stateid regressions
Do not allow cached open for O_RDONLY or O_WRONLY unless the file has been
previously opened in these modes.

Also fix the calculation of the mode in nfs4_close_prepare. We should only
issue an OPEN_DOWNGRADE if we're sure that we will still be holding the
correct open modes. This may not be the case if we've been doing delegated
opens.

Finally, there is no need to adjust the open mode bit flags in
nfs4_close_done(): that has already been done in nfs4_close_prepare().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:19 -04:00
Trond Myklebust
ba683031fa NFSv4: Fix a locking regression in nfs4_set_mode_locked()
We don't really need to clear &state->inode_states inside
nfs4_set_mode_locked, and doing so without holding the inode->i_lock would
in any case be a bug...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:18 -04:00
Trond Myklebust
5e11934d13 NFS: Fix put_nfs_open_context
We need to grab the inode->i_lock atomically with the last reference put in
order to remove the open context that is being freed from the
nfsi->open_files list.

Fix by converting the kref to a standard atomic counter and then using
atomic_dec_and_lock()...
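
A user-space sketch of the decrement-and-lock pattern, using C11 atomics. The toy_* names, the spinlock and the on_list flag are hypothetical stand-ins for inode->i_lock and the nfsi->open_files list; this is not the kernel implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock standing in for inode->i_lock. */
struct toy_lock { atomic_flag held; };

static void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Decrement *count; return true with the lock held only if it hit zero. */
static bool toy_dec_and_lock(atomic_int *count, struct toy_lock *lock)
{
	int old = atomic_load(count);

	/* Fast path: not the last reference, so no lock is needed. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return false;
	}

	/* Slow path: possibly the last reference; lock before dropping it. */
	toy_lock_acquire(lock);
	if (atomic_fetch_sub(count, 1) == 1)
		return true;
	toy_lock_release(lock);
	return false;
}

struct toy_ctx { atomic_int refcount; bool on_list; };

/* Drop a reference; on the last put, unlink under the lock. Returns true
 * when the caller may free ctx. */
static bool toy_put_ctx(struct toy_ctx *ctx, struct toy_lock *list_lock)
{
	if (!toy_dec_and_lock(&ctx->refcount, list_lock))
		return false;
	ctx->on_list = false;	/* removal happens with the lock held */
	toy_lock_release(list_lock);
	return true;
}
```

Because the final decrement happens while the lock is held, no other thread can find the context on the list after its count reaches zero but before it is unlinked.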

Thanks to Arnd Bergmann for pointing out the problem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:17 -04:00