linux/fs
Chuck Lever 0aaaf5c424 NFS: Cache state owners after files are closed
Servers have a finite amount of memory to store NFSv4 open and lock
owners.  Moreover, servers may have a difficult time determining when
they can reap their state owner table, thanks to gray areas in the
NFSv4 protocol specification.  Thus clients should be careful to reuse
state owners when possible.

Currently Linux is not too careful.  When a user has closed all her
files on one mount point, the state owner's reference count goes to
zero, and it is released.  The next OPEN allocates a new one.  A
workload that serially opens and closes files can run through a large
number of open owners this way.

When a state owner's reference count goes to zero, slap it onto a free
list for that nfs_server, with an expiry time.  Garbage collect before
looking for a state owner.  This makes state owners for active users
available for re-use.
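
The caching scheme described above can be sketched in userspace C.  This
is only an illustration under stated assumptions, not the code in the
patch: struct server, struct state_owner, owner_get(), owner_put(), and
owner_gc() are hypothetical names, a plain integer uid stands in for the
owner's identity, the caller passes in the current time, and there is no
locking.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical, simplified stand-ins; not the kernel's data structures. */
struct state_owner {
	int uid;			/* whose opens this owner represents */
	int refcount;			/* open files currently using it */
	time_t expires;			/* meaningful only on the free list */
	struct state_owner *next;	/* link in the active or free list */
};

struct server {
	struct state_owner *active;	/* owners with refcount > 0 */
	struct state_owner *free_list;	/* cached owners, refcount == 0 */
	time_t lease;			/* how long an unused owner is kept */
	int allocations;		/* number of owners ever allocated */
};

/* Free every cached owner whose expiry time has passed. */
static void owner_gc(struct server *srv, time_t now)
{
	struct state_owner **pp = &srv->free_list;

	while (*pp != NULL) {
		struct state_owner *sp = *pp;

		if (sp->expires <= now) {
			*pp = sp->next;
			free(sp);
		} else {
			pp = &sp->next;
		}
	}
}

/* Garbage collect first, then find, revive, or allocate an owner. */
static struct state_owner *owner_get(struct server *srv, int uid, time_t now)
{
	struct state_owner *sp, **pp;

	owner_gc(srv, now);

	for (sp = srv->active; sp != NULL; sp = sp->next) {
		if (sp->uid == uid) {
			sp->refcount++;
			return sp;
		}
	}

	/* Look for a cached owner and take it off the free list. */
	for (pp = &srv->free_list; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->uid == uid) {
			sp = *pp;
			*pp = sp->next;
			goto activate;
		}
	}

	sp = calloc(1, sizeof(*sp));
	if (sp == NULL)
		return NULL;
	sp->uid = uid;
	srv->allocations++;
activate:
	sp->refcount = 1;
	sp->next = srv->active;
	srv->active = sp;
	return sp;
}

/* Drop a reference; on the last put, cache the owner instead of freeing. */
static void owner_put(struct server *srv, struct state_owner *sp, time_t now)
{
	struct state_owner **pp;

	assert(sp->refcount > 0);
	if (--sp->refcount > 0)
		return;

	for (pp = &srv->active; *pp != NULL && *pp != sp; pp = &(*pp)->next)
		;
	assert(*pp == sp);
	*pp = sp->next;

	sp->expires = now + srv->lease;
	sp->next = srv->free_list;
	srv->free_list = sp;
}
```

In this sketch, a user who closes a file and opens another one before
the expiry time gets the same owner back, so no new owner is allocated.
An owner left unused past its expiry time is freed by the next
owner_get() call on that server.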

Now that there can be unused state owners remaining at umount time,
purge the state owner free list when a server is destroyed.  Also be
sure not to reclaim unused state owners during state recovery.

This change has benefits for the client as well.  For some workloads,
this approach drops the number of OPEN_CONFIRM calls from one per OPEN
call down to just one in total.  This reduces wire traffic
and thus open(2) latency.  Before this patch, untarring a kernel
source tarball shows the OPEN_CONFIRM call counter steadily increasing
through the test.  With the patch, the OPEN_CONFIRM count remains at 1
throughout the entire untar.

As long as the expiry time is kept short, I don't think garbage
collection should be terribly expensive, although it does bounce the
clp->cl_lock around a bit.

[ At some point we should rationalize the use of the nfs_server
->destroy method. ]

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[Trond: Fixed a garbage collection race and a few efficiency issues]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 11:59:18 -05:00
9p filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
adfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
affs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
afs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
autofs4 filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
befs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
bfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2011-12-23 14:58:39 -08:00
cachefiles kill useless checks for sb->s_op == NULL 2011-07-20 01:44:21 -04:00
ceph ceph: disable use of dcache for readdir etc. 2011-12-29 08:05:14 -08:00
cifs [CIFS] default ntlmv2 for cifs mount delayed to 3.3 2012-01-04 07:54:40 -06:00
coda filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
configfs configfs: register_filesystem() called too early 2011-12-13 12:35:15 -05:00
cramfs cramfs: get_cramfs_inode() returns ERR_PTR() on failure 2011-07-17 23:22:02 -04:00
debugfs debugfs: Fix a comment mistake 2011-08-22 17:41:48 -07:00
devpts filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
dlm Merge branch 'for-3.1' of git://linux-nfs.org/~bfields/linux 2011-07-25 22:49:19 -07:00
ecryptfs eCryptfs: Extend array bounds for all filename chars 2011-11-23 15:43:53 -06:00
efs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
exofs Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
exportfs
ext2 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue 2011-11-02 11:41:01 -07:00
ext3 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue 2011-11-02 11:41:01 -07:00
ext4 ext4: handle EOF correctly in ext4_bio_write_page() 2011-12-13 22:29:12 -05:00
fat filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
freevxfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
fscache FS-Cache: Fix __fscache_uncache_all_inode_pages()'s outer loop 2011-07-21 10:59:16 -07:00
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse 2011-12-14 18:23:35 -08:00
gfs2 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
hfs hfs: add sanity check for file name length 2011-11-15 14:29:42 -02:00
hfsplus filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
hostfs Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue 2011-11-02 11:41:01 -07:00
hpfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
hppfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
hugetlbfs filesystems: add missing nlink wrappers 2011-11-02 12:53:43 +01:00
isofs Merge branch 'akpm' (Andrew's incoming - part two) 2011-11-02 16:07:27 -07:00
jbd jbd/jbd2: validate sb->s_first in journal_get_superblock() 2011-11-01 19:04:59 -04:00
jbd2 jbd2: Unify log messages in jbd2 code 2011-11-01 19:09:18 -04:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2011-11-07 09:11:16 -08:00
jfs Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
lockd SUNRPC: Replace svc_addr_u by sockaddr_storage 2011-09-14 08:21:48 -04:00
logfs Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
minix minixfs: misplaced checks lead to dentry leak 2012-01-04 15:03:06 -08:00
ncpfs fs/ncpfs: fix error paths and goto statements in ncp_fill_super() 2011-12-14 00:45:33 -05:00
nfs NFS: Cache state owners after files are closed 2012-01-05 11:59:18 -05:00
nfs_common
nfsd SUNRPC: Clean up the RPCSEC_GSS service ticket requests 2012-01-05 10:42:38 -05:00
nilfs2 nilfs2: potential integer overflow in nilfs_ioctl_clean_segments() 2011-12-20 10:25:04 -08:00
nls
notify atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
ntfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
ocfs2 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 2011-12-01 14:55:34 -08:00
omfs omfs: fix (mode & S_IFDIR) abuse 2011-07-26 13:05:28 -04:00
openpromfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
partitions treewide: use __printf not __attribute__((format(printf,...))) 2011-10-31 17:30:54 -07:00
proc procfs: do not confuse jiffies with cputime64_t 2011-12-29 16:31:57 -08:00
pstore pstore: pass allocated memory region back to caller 2011-11-17 12:58:07 -08:00
qnx4 filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
quota Merge branch 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux 2011-11-06 19:02:23 -08:00
ramfs ramfs: remove module leftovers 2011-11-02 16:06:58 -07:00
reiserfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
romfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
squashfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next 2011-11-04 16:48:37 -07:00
sysfs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
sysv filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
ubifs ubifs: too early register_filesystem() 2011-12-13 12:35:13 -05:00
udf Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue 2011-11-02 11:41:01 -07:00
ufs filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
xfs xfs: log all dirty inodes in xfs_fs_sync_fs 2011-12-23 16:41:47 -06:00
aio.c aio: allocate kiocbs in batches 2011-11-02 16:07:03 -07:00
anon_inodes.c vfs: dont chain pipe/anon/socket on superblock s_inodes list 2011-07-26 12:57:09 -04:00
attr.c Merge branch 'next-evm' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/ima-2.6 into next 2011-08-09 10:31:03 +10:00
bad_inode.c fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers 2011-07-20 20:47:59 -04:00
binfmt_aout.c
binfmt_elf_fdpic.c consolidate BINPRM_FLAGS_ENFORCE_NONDUMP handling 2011-07-20 01:43:10 -04:00
binfmt_elf.c binfmt_elf: fix PIE execution with randomization disabled 2011-11-02 16:06:58 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c filesystems: add missing nlink wrappers 2011-11-02 12:53:43 +01:00
binfmt_script.c
binfmt_som.c
bio-integrity.c fs: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros 2011-10-31 19:30:31 -04:00
bio.c bio: change some signed vars to unsigned 2011-11-16 09:21:50 +01:00
block_dev.c Merge branch 'for-3.2/drivers' of git://git.kernel.dk/linux-block 2011-11-04 17:22:14 -07:00
buffer.c Merge branch 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux 2011-11-06 19:02:23 -08:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c compat_ioctl: add compat handler for PPPIOCGL2TPSTATS 2011-08-07 22:24:41 -07:00
compat.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
dcache.c fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API 2011-12-06 23:57:18 -05:00
dcookies.c
direct-io.c direct-io: merge direct_io_walker into __blockdev_direct_IO 2011-10-28 14:58:58 +02:00
drop_caches.c
eventfd.c
eventpoll.c epoll: fix spurious lockdep warnings 2011-10-31 17:30:57 -07:00
exec.c oom: remove oom_disable_count 2011-10-31 17:30:45 -07:00
fcntl.c
fhandle.c
fifo.c
file_table.c atomic: use <linux/atomic.h> 2011-07-26 16:49:47 -07:00
file.c
filesystems.c
fs_struct.c
fs-writeback.c writeback: show writeback reason with __print_symbolic 2011-12-18 14:20:17 +08:00
generic_acl.c switch posix_acl_equiv_mode() to umode_t * 2011-08-01 02:10:06 -04:00
inode.c vfs: protect i_nlink 2011-11-02 12:53:43 +01:00
internal.h superblock: move pin_sb_for_writeback() to fs/super.c 2011-07-20 01:44:38 -04:00
ioctl.c
ioprio.c fs: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros 2011-10-31 19:30:31 -04:00
Kconfig tmpfs: add "tmpfs" to the Kconfig prompt to make it obvious. 2011-10-31 17:30:45 -07:00
Kconfig.binfmt
libfs.c filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
locks.c vfs: fix handling of lock allocation failure in lease-break case 2011-12-26 10:25:26 -08:00
Makefile fs/Makefile: Stupid typo breakage of exofs inclusion 2011-10-27 08:36:51 +02:00
mbcache.c
mpage.c
namei.c VFS: we need to set LOOKUP_JUMPED on mountpoint crossing 2011-11-07 14:58:06 -08:00
namespace.c fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API 2011-12-06 23:57:18 -05:00
no-block.c
open.c leases: fix write-open/read-lease race 2011-10-28 14:59:00 +02:00
pipe.c fs/pipe.c: add ->statfs callback for pipefs 2011-10-31 17:30:51 -07:00
pnode.c
pnode.h
posix_acl.c vfs: pass all mask flags check_acl and posix_acl_permission 2011-10-28 14:58:54 +02:00
read_write.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
read_write.h
readdir.c
select.c
seq_file.c fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API 2011-12-06 23:57:18 -05:00
signalfd.c
splice.c tmpfs: clone shmem_file_splice_read() 2011-07-25 20:57:11 -07:00
stack.c filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
stat.c readlinkat: ensure we return ENOENT for the empty pathname for normal lookups 2011-11-02 12:53:42 +01:00
statfs.c VFS: fix statfs() automounter semantics regression 2011-11-04 18:15:59 -07:00
super.c vfs: ignore error on forced remount 2011-11-02 12:53:42 +01:00
sync.c writeback: Add a 'reason' to wb_writeback_work 2011-10-31 00:33:36 +08:00
timerfd.c
utimes.c
xattr_acl.c
xattr.c evm: evm_inode_post_removexattr 2011-07-18 12:29:43 -04:00