linux/fs
Bob Peterson f90e5b5b13 GFS2: Processes waiting on inode glock that no processes are holding
This patch fixes a race in the GFS2 glock state machine that may
result in lockups.  The symptom is that all nodes but one will
hang, waiting for a particular glock.  All the holder records
will have the "W" (Waiting) bit set.  The other node will
typically have the glock stuck in Exclusive mode (EX) with no
holder records, but the dinode will be cached.  In other words,
an entry with "I:" will appear in the glock dump for that glock,
but nothing else.

The race has to do with the glock "Pending Demote" bit, which
can be set, then immediately reset, thus losing the fact that
another node needs the glock.  The sequence of events is:

1. Something schedules the glock workqueue (e.g. glock request from fs)
2. The glock workqueue gets to the point between the test of the reply pending
bit and the spin lock:

        if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags)) {
                finish_xmote(gl, gl->gl_reply);
                drop_ref = 1;
        }
        down_read(&gfs2_umount_flush_sem);         <---- i.e. here
        spin_lock(&gl->gl_spin);

3. In comes (a) the reply to our EX lock request setting GLF_REPLY_PENDING and
            (b) the demote request which sets GLF_PENDING_DEMOTE

4. The following test is executed:

        if (test_and_clear_bit(GLF_PENDING_DEMOTE, &gl->gl_flags) &&
            gl->gl_state != LM_ST_UNLOCKED &&
            gl->gl_demote_state != LM_ST_EXCLUSIVE) {

This resets the pending demote flag, and gl->gl_demote_state is not equal to
exclusive, however because the reply from the dlm arrived after we checked for
the GLF_REPLY_PENDING flag, gl->gl_state is still equal to unlocked, so
although we reset the GLF_PENDING_DEMOTE flag, we didn't then set the
GLF_DEMOTE flag or reinstate the GLF_PENDING_DEMOTE_FLAG.

The patch closes the timing window by only transitioning the
"Pending demote" bit to the "demote" flag once we know the
other conditions (not unlocked and not exclusive) are met.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2011-05-25 10:37:11 +01:00
..
9p fs/9p: Fix error reported by coccicheck 2011-04-15 15:26:14 -05:00
adfs Fix common misspellings 2011-03-31 11:26:23 -03:00
affs Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
afs Fix common misspellings 2011-03-31 11:26:23 -03:00
autofs4 Fix common misspellings 2011-03-31 11:26:23 -03:00
befs Fix common misspellings 2011-03-31 11:26:23 -03:00
bfs Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-05-23 09:12:26 -07:00
cachefiles Fix common misspellings 2011-03-31 11:26:23 -03:00
ceph ceph: do not use i_wrbuffer_ref as refcount for Fb cap 2011-05-11 10:44:48 -07:00
cifs [CIFS] Fix to problem with getattr caused by invalidate simplification patch 2011-05-20 17:00:01 +00:00
coda codafs: fix build break when CONFIG_PROC_SYSCTL=n 2011-03-25 17:45:16 -07:00
configfs configfs: Fix race between configfs_readdir() and configfs_d_iput() 2011-05-18 04:08:16 -07:00
cramfs cramfs: generate unique inode number for better inode cache usage 2011-01-13 08:03:23 -08:00
debugfs debugfs: move to new strtobool 2011-05-19 16:55:28 +09:30
devpts fs/devpts/inode.c: correctly check d_alloc_name() return code in devpts_pty_new() 2011-03-22 17:44:17 -07:00
dlm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm 2011-05-24 15:04:00 -07:00
ecryptfs eCryptfs: Flush dirty pages in setattr 2011-04-25 18:49:46 -05:00
efs block: remove per-queue plugging 2011-03-10 08:52:07 +01:00
exofs Fix common misspellings 2011-03-31 11:26:23 -03:00
exportfs vfs: Add open by file handle support 2011-03-15 02:21:44 -04:00
ext2 ext2: fix error msg when mounting fs with too-large blocksize 2011-05-17 13:47:42 +02:00
ext3 ext3: Fix fs corruption when make_indexed_dir() fails 2011-05-17 13:47:15 +02:00
ext4 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 2011-04-11 15:45:47 -07:00
fat fat: Fix statfs->f_namelen 2011-04-12 21:12:50 +09:00
freevxfs treewide: fix a few typos in comments 2011-05-10 10:16:21 +02:00
fscache FS-Cache: Fix operation handling 2011-01-14 09:23:36 -08:00
fuse fuse: fix oops in revalidate when called with NULL nameidata 2011-05-10 17:35:58 +02:00
gfs2 GFS2: Processes waiting on inode glock that no processes are holding 2011-05-25 10:37:11 +01:00
hfs Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
hfsplus Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
hostfs switch hostfs 2011-01-12 20:03:42 -05:00
hpfs HPFS: Remove unused variable 2011-05-09 09:04:24 -07:00
hppfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
hugetlbfs mm: hugetlbfs: change remove_from_page_cache 2011-03-22 17:44:02 -07:00
isofs Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
jbd jbd: Fix comment to match the code in journal_start() 2011-05-24 00:27:53 +02:00
jbd2 jbd/jbd2: remove obsolete summarise_journal_usage. 2011-05-17 13:47:42 +02:00
jffs2 Fix common misspellings 2011-03-31 11:26:23 -03:00
jfs Fix common misspellings 2011-03-31 11:26:23 -03:00
lockd NLM: Fix "kernel BUG at fs/lockd/host.c:417!" or ".../host.c:283!" 2011-01-25 15:24:47 -05:00
logfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-05-23 09:12:26 -07:00
minix Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
ncpfs Fix common misspellings 2011-03-31 11:26:23 -03:00
nfs NFSv4.1: Ensure that layoutget uses the correct gfp modes 2011-05-11 22:52:13 -04:00
nfs_common Fix common misspellings 2011-03-31 11:26:23 -03:00
nfsd treewide: fix a few typos in comments 2011-05-10 10:16:21 +02:00
nilfs2 nilfs2: use mark_buffer_dirty to mark btnode or meta data dirty 2011-05-10 22:21:57 +09:00
nls
notify Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6 2011-04-07 11:14:49 -07:00
ntfs Fix common misspellings 2011-03-31 11:26:23 -03:00
ocfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-05-23 09:12:26 -07:00
omfs Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
openpromfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
partitions Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2011-05-23 09:12:26 -07:00
proc Don't lock guardpage if the stack is growing up 2011-05-09 16:22:07 -07:00
pstore pstore: fix pstore filesystem mount/remount issue 2011-05-16 11:05:00 -07:00
qnx4 block: remove per-queue plugging 2011-03-10 08:52:07 +01:00
quota Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 2011-04-08 07:35:17 -07:00
ramfs ramfs: fix memleak on no-mmu arch 2011-04-14 16:06:56 -07:00
reiserfs Fix common misspellings 2011-03-31 11:26:23 -03:00
romfs fs: icache RCU free inodes 2011-01-07 17:50:26 +11:00
squashfs treewide: fix a few typos in comments 2011-05-10 10:16:21 +02:00
sysfs sysfs: remove "last sysfs file:" line from the oops messages 2011-05-13 16:05:51 -07:00
sysv Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
ubifs UBIFS: switch to dynamic printks 2011-05-23 08:22:20 +03:00
udf Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
ufs Merge branch 'master' into for-next 2011-04-26 10:22:59 +02:00
xfs Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs 2011-05-23 15:19:16 -07:00
aio.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
anon_inodes.c sanitize vfsmount refcounting changes 2011-01-16 13:47:07 -05:00
attr.c Fix common misspellings 2011-03-31 11:26:23 -03:00
bad_inode.c fs: provide rcu-walk aware permission i_ops 2011-01-07 17:50:29 +11:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c brk: COMPAT_BRK: fix detection of randomized brk 2011-04-14 16:06:55 -07:00
binfmt_em86.c
binfmt_flat.c CRED: Fix load_flat_shared_library() to initialise bprm correctly 2011-05-03 10:10:51 +10:00
binfmt_misc.c convert get_sb_single() users 2010-10-29 04:16:28 -04:00
binfmt_script.c
binfmt_som.c
bio-integrity.c block: Require subsystems to explicitly allocate bio_set integrity mempool 2011-03-17 11:11:05 +01:00
bio.c Fix common misspellings 2011-03-31 11:26:23 -03:00
block_dev.c block: move bd_set_size() above rescan_partitions() in __blkdev_get() 2011-05-23 08:50:48 -07:00
buffer.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2011-03-24 19:01:30 -07:00
char_dev.c Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block 2011-01-13 10:45:01 -08:00
compat_binfmt_elf.c
compat_ioctl.c Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 2011-01-07 14:39:20 -08:00
compat.c exec: unify do_execve/compat_do_execve code 2011-04-09 15:53:56 +02:00
dcache.c sanitize <linux/prefetch.h> usage 2011-05-20 12:50:29 -07:00
dcookies.c
direct-io.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
drop_caches.c fs: move i_sb_list out from under inode_lock 2011-03-24 21:16:32 -04:00
eventfd.c Docbook: add fs/eventfd.c and fix typos in it 2011-02-21 15:07:04 -08:00
eventpoll.c Fix common misspellings 2011-03-31 11:26:23 -03:00
exec.c Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6 2011-05-23 12:23:20 -07:00
fcntl.c userns: rename is_owner_or_cap to inode_owner_or_capable 2011-03-23 19:47:13 -07:00
fhandle.c fs/fhandle.c: add <linux/personality.h> for ia64 2011-04-14 16:06:56 -07:00
fifo.c Filesystem: fifo: Fixed coding style issue. 2011-03-21 00:16:09 -04:00
file_table.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2011-03-16 13:26:17 -07:00
file.c vfs: avoid large kmalloc()s for the fdtable 2011-04-28 11:28:20 -07:00
filesystems.c fs: synchronize_rcu when unregister_filesystem success not failure 2011-04-17 10:42:01 -07:00
fs_struct.c sanitize vfsmount refcounting changes 2011-01-16 13:47:07 -05:00
fs-writeback.c Fix common misspellings 2011-03-31 11:26:23 -03:00
generic_acl.c userns: rename is_owner_or_cap to inode_owner_or_capable 2011-03-23 19:47:13 -07:00
inode.c fs: add missing prefetch.h include 2011-05-22 11:26:02 -07:00
internal.h fs: move i_wb_list out from under inode_lock 2011-03-24 21:17:51 -04:00
ioctl.c vfs: cleanup do_vfs_ioctl() 2011-03-21 00:16:08 -04:00
ioprio.c ioprio: grab rcu_read_lock in sys_ioprio_{set,get}() 2010-11-15 10:23:31 +01:00
Kconfig Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 2011-03-16 19:01:29 -07:00
Kconfig.binfmt coredump: default CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y 2010-10-27 18:03:12 -07:00
libfs.c pass default dentry_operations to mount_pseudo() 2011-01-12 20:03:43 -05:00
locks.c Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux 2011-03-24 08:20:39 -07:00
Makefile Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 2011-03-16 19:01:29 -07:00
mbcache.c Fix common misspellings 2011-03-31 11:26:23 -03:00
mpage.c fs: make mpage read/write_pages() plug 2011-03-10 08:52:26 +01:00
namei.c VFS: move BUG_ON test for symlink nd->depth after current->link_count test 2011-05-21 00:12:16 -07:00
namespace.c Revert "vfs: Export file system uuid via /proc/<pid>/mountinfo" 2011-04-12 13:35:56 -07:00
nfsctl.c open-style analog of vfs_path_lookup() 2011-03-14 09:15:28 -04:00
no-block.c
open.c fs: Use BUG_ON(!mnt) at dentry_open(). 2011-03-21 01:10:41 -04:00
pipe.c Fix broken "pipe: use event aware wakeups" optimization 2011-01-20 16:21:59 -08:00
pnode.c fs: scale mntget/mntput 2011-01-07 17:50:33 +11:00
pnode.h
posix_acl.c NFS: Prevent memory allocation failure in nfsacl_encode() 2011-01-25 15:24:47 -05:00
read_write.c fix signedness mess in rw_verify_area() on 64bit architectures 2011-01-12 20:06:58 -05:00
read_write.h
readdir.c
select.c select: remove unused MAX_SELECT_SECONDS 2011-03-21 00:16:08 -04:00
seq_file.c fs: take dcache_lock inside __d_path 2010-10-25 21:26:12 -04:00
signalfd.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2010-10-26 10:13:10 -07:00
splice.c Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block 2011-01-13 10:45:01 -08:00
stack.c
stat.c readlinkat(), fchownat() and fstatat() with empty relative pathnames 2011-03-15 02:21:45 -04:00
statfs.c clean statfs-like syscalls up 2011-03-14 09:15:28 -04:00
super.c VFS: trivial: fix comment on s_maxbytes value warning check 2011-05-19 14:10:49 +00:00
sync.c Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block 2011-03-24 10:16:26 -07:00
timerfd.c timerfd: Manage cancelable timers in timerfd 2011-05-23 13:59:53 +02:00
utimes.c userns: rename is_owner_or_cap to inode_owner_or_capable 2011-03-23 19:47:13 -07:00
xattr_acl.c
xattr.c vfs: Pass setxattr(2) flags properly 2011-04-21 07:34:44 -07:00