linux/fs/ocfs2
piaojun 86b652b93a ocfs2/dlm: disable BUG_ON when DLM_LOCK_RES_DROPPING_REF is cleared before dlm_deref_lockres_done_handler
We found a BUG situation in which DLM_LOCK_RES_DROPPING_REF is cleared
unexpected that described below.  To solve the bug, we disable the
BUG_ON and purge lockres in dlm_do_local_recovery_cleanup.

Node 1                               Node 2(master)
dlm_purge_lockres
                                     dlm_deref_lockres_handler

                                     DLM_LOCK_RES_SETREF_INPROG is set
                                     response DLM_DEREF_RESPONSE_INPROG

receive DLM_DEREF_RESPONSE_INPROG
stop puring in dlm_purge_lockres
and wait for DLM_DEREF_RESPONSE_DONE

                                     dispatch dlm_deref_lockres_worker
                                     response DLM_DEREF_RESPONSE_DONE

receive DLM_DEREF_RESPONSE_DONE and
prepare to purge lockres

                                     Node 2 goes down

find Node2 down and do local
clean up for Node2:
dlm_do_local_recovery_cleanup
  -> clear DLM_LOCK_RES_DROPPING_REF

when purging lockres, BUG_ON happens
because DLM_LOCK_RES_DROPPING_REF is clear:
dlm_deref_lockres_done_handler
  ->BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));

[akpm@linux-foundation.org: fix duplicated write to `ret']
Fixes: 60d663cb52 ("ocfs2/dlm: add DEREF_DONE message")
Link: http://lkml.kernel.org/r/57845055.9080702@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-02 17:31:41 -04:00
..
cluster Merge branch 'akpm' (patches from Andrew) 2016-07-26 19:55:54 -07:00
dlm ocfs2/dlm: disable BUG_ON when DLM_LOCK_RES_DROPPING_REF is cleared before dlm_deref_lockres_done_handler 2016-08-02 17:31:41 -04:00
dlmfs mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
acl.c ocfs2: fix posix_acl_create deadlock 2016-05-12 15:52:50 -07:00
acl.h ocfs2: fix posix_acl_create deadlock 2016-05-12 15:52:50 -07:00
alloc.c ocfs2: retry on ENOSPC if sufficient space in truncate log 2016-08-02 17:31:41 -04:00
alloc.h ocfs2: retry on ENOSPC if sufficient space in truncate log 2016-08-02 17:31:41 -04:00
aops.c ocfs2: retry on ENOSPC if sufficient space in truncate log 2016-08-02 17:31:41 -04:00
aops.h ocfs2: fix ip_unaligned_aio deadlock with dio work queue 2016-03-25 16:37:42 -07:00
blockcheck.c
blockcheck.h
buffer_head_io.c Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-block 2016-07-26 15:03:07 -07:00
buffer_head_io.h
dcache.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
dcache.h ocfs2: revert iput deferring code in ocfs2_drop_dentry_lock 2014-04-03 16:20:55 -07:00
dir.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
dir.h VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
dlmglue.c ocfs2: remove obscure BUG_ON in dlmglue 2016-07-26 16:19:19 -07:00
dlmglue.h ocfs2: avoid blocking in ocfs2_mark_lockres_freeing() in downconvert thread 2014-04-03 16:20:55 -07:00
export.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
export.h
extent_map.c ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
extent_map.h
file.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-17 11:01:31 -07:00
file.h ocfs2: prepare some interfaces used in append direct io 2015-02-16 17:56:04 -08:00
filecheck.c ocfs2: sysfile interfaces for online file check 2016-03-22 15:36:02 -07:00
filecheck.h ocfs2: sysfile interfaces for online file check 2016-03-22 15:36:02 -07:00
heartbeat.c
heartbeat.h
inode.c ocfs2: fix improper handling of return errno 2016-05-26 15:35:44 -07:00
inode.h ocfs2: cleanup implemented prototypes 2016-07-26 16:19:19 -07:00
ioctl.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ioctl.h
journal.c ocfs2: improve recovery performance 2016-07-26 16:19:19 -07:00
journal.h jbd2: add support for avoiding data writes during transaction commits 2016-04-24 00:56:07 -04:00
Kconfig
localalloc.c ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local 2016-03-25 16:37:42 -07:00
localalloc.h ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters 2014-02-06 13:48:51 -08:00
locks.c ocfs2: fix flock panic issue 2015-12-29 17:45:49 -08:00
locks.h
Makefile ocfs2: disable BUG assertions in reading blocks 2016-06-24 17:23:52 -07:00
mmap.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
mmap.h
move_extents.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
move_extents.h
namei.c ocfs2: fix posix_acl_create deadlock 2016-05-12 15:52:50 -07:00
namei.h ocfs2: do not include dio entry in case of orphan scan 2015-11-05 19:34:48 -08:00
ocfs1_fs_compat.h
ocfs2_fs.h ocfs2: fix comment in struct ocfs2_extended_slot 2016-05-19 19:12:14 -07:00
ocfs2_ioctl.h
ocfs2_lockid.h
ocfs2_lockingver.h
ocfs2_trace.h ocfs2: code clean up for direct io 2016-03-25 16:37:42 -07:00
ocfs2.h mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
quota_global.c quota: use time64_t internally 2016-06-19 18:09:31 +02:00
quota_local.c ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
quota.h quota: constify qtree_fmt_operations structures 2016-01-04 10:58:35 +01:00
refcounttree.c ocfs2: fix posix_acl_create deadlock 2016-05-12 15:52:50 -07:00
refcounttree.h ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page 2013-08-13 17:57:49 -07:00
reservations.c ocfs2: make resv_lock spinlock static 2015-02-10 14:30:29 -08:00
reservations.h
resize.c ocfs2: solve a problem of crossing the boundary in updating backups 2016-03-25 16:37:42 -07:00
resize.h
slot_map.c ocfs2: clean up an unneeded goto in ocfs2_put_slot() 2016-05-19 19:12:14 -07:00
slot_map.h
stack_o2cb.c ocfs2: avoid a pointless delay in o2cb_cluster_check() 2015-04-14 16:48:57 -07:00
stack_user.c ocfs2: ensure that dlm lockspace is created by kernel module 2016-08-02 17:31:41 -04:00
stackglue.c ocfs2: fix a redundant re-initialization 2016-07-26 16:19:19 -07:00
stackglue.h ocfs2: export ocfs2_kset for online file check 2016-03-22 15:36:02 -07:00
suballoc.c ocfs2: retry on ENOSPC if sufficient space in truncate log 2016-08-02 17:31:41 -04:00
suballoc.h ocfs2: rollback alloc_dinode counts when ocfs2_block_group_set_bits() failed 2014-04-03 16:20:56 -07:00
super.c Merge branch 'akpm' (patches from Andrew) 2016-07-26 19:55:54 -07:00
super.h ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local 2016-03-25 16:37:42 -07:00
symlink.c switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
symlink.h
sysfile.c ocfs2: avoid system inode ref confusion by adding mutex lock 2014-04-03 16:20:57 -07:00
sysfile.h
uptodate.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
uptodate.h
xattr.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-07-28 14:22:25 -07:00
xattr.h ocfs2: fix posix_acl_create deadlock 2016-05-12 15:52:50 -07:00