linux/fs/dlm
David Teigland 222d396092 [DLM] fix master recovery
If master recovery happens on an rsb in one recovery sequence, and that
sequence is aborted before lock recovery happens, then the next sequence
relies on the previous master recovery (which may now be invalid because
another node ignored a lookup result) and goes on to do the lock
recovery, where we get stuck due to an invalid master value.

 recovery cycle begins: master of rsb X has left
 nodes A and B send node C an rcom lookup for X to find the new master
 C gets lookup from B first, sets B as new master, and sends reply back to B
 C gets lookup from A next, and sends reply back to A saying B is master
 A gets lookup reply from C and sets B as the new master in the rsb
 recovery cycle on A, B and C is aborted to start a new recovery
 B gets lookup reply from C and ignores it since there's a new recovery
 recovery cycle begins: some other node has joined
 B doesn't think it's the master of X so it doesn't rebuild it in the directory
 C looks up the master of X, no one is master, so it becomes new master
 B looks up the master of X, finds it's C
 A believes that B is the master of X, so it sends its lock to B
 B sends an error back to A
 A resends
 this repeats forever, the incorrect master value on A is never corrected
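
The sequence above can be replayed with a small model of the directory
lookup on node C. This is only a sketch: dir_entry_sketch,
dir_lookup_sketch, replay_sketch and the node ids are hypothetical names
for illustration, not the code in dir.c or recover.c. It assumes a lookup
for a resource with no recorded master makes the requester the master,
and that the entry for X is empty in the second cycle because B did not
rebuild it.

```c
/* Hypothetical node ids, for illustration only. */
enum { NODE_A = 1, NODE_B = 2, NODE_C = 3 };

/* Hypothetical stand-in for one directory entry on node C.
 * master_nodeid == 0 means "no master recorded". */
struct dir_entry_sketch {
	int master_nodeid;
};

/* Assumed lookup rule: the first requester for an unmastered
 * resource becomes its master; later requesters are told who it is. */
static int dir_lookup_sketch(struct dir_entry_sketch *de, int from_nodeid)
{
	if (de->master_nodeid == 0)
		de->master_nodeid = from_nodeid;
	return de->master_nodeid;
}

struct replay_result {
	int a_master;   /* master of X as recorded in A's rsb */
	int dir_master; /* master of X according to the directory */
};

static struct replay_result replay_sketch(void)
{
	struct dir_entry_sketch x = { 0 };
	struct replay_result res;

	/* First cycle: C handles B's lookup, then A's. */
	dir_lookup_sketch(&x, NODE_B);
	res.a_master = dir_lookup_sketch(&x, NODE_A); /* A records B */
	/* Cycle aborted here; B ignores its own reply. */

	/* Second cycle: B did not rebuild X, so the entry is empty. */
	x.master_nodeid = 0;
	dir_lookup_sketch(&x, NODE_C); /* C becomes master */
	dir_lookup_sketch(&x, NODE_B); /* B is told C */
	/* A does no lookup: it still trusts the first cycle's result. */

	res.dir_master = x.master_nodeid;
	return res;
}
```

In this model A ends up holding NODE_B while the directory holds NODE_C,
which is the mismatch that makes A resend to B indefinitely.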

The fix is to do master recovery on an rsb that still has the NEW_MASTER
flag set from an earlier recovery sequence, and therefore didn't complete
lock recovery.
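
As a rough illustration of that rule, the decision of whether an rsb
needs master recovery might look like the sketch below. The struct, the
flag constant and the function name are hypothetical and chosen for this
example; they are not the definitions from dlm_internal.h, and the real
change in recover.c may be shaped differently.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for NEW_MASTER. */
#define RSB_NEW_MASTER_SKETCH 0x1u

/* Hypothetical, reduced view of an rsb. */
struct rsb_sketch {
	int master_nodeid;
	unsigned int flags;
};

/* Redo master recovery when the master has left, or when the flag is
 * still set from an earlier sequence that never reached lock recovery. */
static bool needs_master_recovery_sketch(const struct rsb_sketch *r,
					 bool master_departed)
{
	if (master_departed)
		return true;
	return (r->flags & RSB_NEW_MASTER_SKETCH) != 0;
}
```

With this check, node A in the scenario above would look up the master of
X again in the second sequence instead of trusting the stale value B.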

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:36:58 -05:00
ast.c [DLM] down conversion clearing flags 2006-08-23 16:07:31 -04:00
ast.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
config.c [DLM] expose dlm_config_info fields in configfs 2007-02-05 13:36:43 -05:00
config.h [DLM] add config entry to enable log_debug 2007-02-05 13:36:40 -05:00
debug_fs.c [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs) 2006-09-28 08:32:24 -04:00
dir.c [DLM] Update DLM to the latest patch level 2006-01-20 08:47:07 +00:00
dir.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
dlm_internal.h [DLM] fix user unlocking 2007-02-05 13:36:55 -05:00
Kconfig [DLM] Fix DLM config 2006-11-30 10:35:41 -05:00
lock.c [DLM] fix user unlocking 2007-02-05 13:36:55 -05:00
lock.h [DLM] dump rsb and locks on assert 2006-08-21 09:50:09 -04:00
lockspace.c [DLM] rename dlm_config_info fields 2007-02-05 13:36:37 -05:00
lockspace.h [DLM] dlm: user locks 2006-07-13 09:25:34 -04:00
lowcomms-sctp.c [DLM] Use workqueues for dlm lowcomms 2007-02-05 13:36:52 -05:00
lowcomms-tcp.c [DLM] Use workqueues for dlm lowcomms 2007-02-05 13:36:52 -05:00
lowcomms.h [DLM] Clean up lowcomms 2006-12-07 09:25:13 -05:00
lvb_table.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
main.c [DLM] Clean up lowcomms 2006-12-07 09:25:13 -05:00
Makefile [DLM] Add support for tcp communications 2006-11-30 10:35:00 -05:00
member.c [DLM] fix aborted recovery during node removal 2006-11-30 10:35:13 -05:00
member.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
memory.c [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
memory.h [DLM] Remove range locks from the DLM 2006-02-23 09:56:38 +00:00
midcomms.c [DLM] rename dlm_config_info fields 2007-02-05 13:36:37 -05:00
midcomms.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
rcom.c [DLM] rename dlm_config_info fields 2007-02-05 13:36:37 -05:00
rcom.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
recover.c [DLM] fix master recovery 2007-02-05 13:36:58 -05:00
recover.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
recoverd.c [DLM] change some log_error to log_debug 2007-02-05 13:36:34 -05:00
recoverd.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00
requestqueue.c [DLM] fix add_requestqueue checking nodes list 2006-11-30 10:37:00 -05:00
requestqueue.h [DLM] fix requestqueue race 2006-11-30 10:35:10 -05:00
user.c [DLM] fix user unlocking 2007-02-05 13:36:55 -05:00
user.h [DLM] dlm: user locks 2006-07-13 09:25:34 -04:00
util.c [DLM] fix old rcom messages 2007-02-05 13:35:50 -05:00
util.h [DLM] The core of the DLM for GFS2/CLVM 2006-01-18 09:30:29 +00:00