mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-11-24 12:44:11 +08:00
ocfs2/cluster: close a race that fence can't be triggered
When some nodes of cluster face with TCP connection fault, ocfs2 will pick up a quorum to continue to work and other nodes will be fenced by resetting host. In order to decide which node should be fenced, ocfs2 leverages o2quo_state::qs_holds. If that variable is reduced to zero, then a try to decide if fence local node is performed. However, under a specific scenario that local node is not disconnected from others at the same time, above method has a problem to reduce ::qs_holds to zero. Because, o2net 90s idle timer corresponding to different nodes is triggered one after another. node 2 node 3 90s idle timer elapses clear ::qs_conn_bm set hold 40s is passed 90 idle timer elapses clear ::qs_conn_bm set hold still up timer elapses clear hold (NOT to zero ) 90s idle timer elapses AGAIN still up timer elapses. clear hold still up timer elapses To solve this issue, a node which has already be evicted from ::qs_conn_bm can't set hold again and again invoked from idle timer. Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373F1F3F93B@H3CMLB12-EX.srv.huawei-3com.com Signed-off-by: Yang Zhang <zhang.yangB@h3c.com> Signed-off-by: Changwei Ge <ge.changwei@h3c.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
parent
a52370b3b1
commit
fc2af28bd9
@ -314,12 +314,13 @@ void o2quo_conn_err(u8 node)
|
||||
node, qs->qs_connected);
|
||||
|
||||
clear_bit(node, qs->qs_conn_bm);
|
||||
|
||||
if (test_bit(node, qs->qs_hb_bm))
|
||||
o2quo_set_hold(qs, node);
|
||||
}
|
||||
|
||||
mlog(0, "node %u, %d total\n", node, qs->qs_connected);
|
||||
|
||||
if (test_bit(node, qs->qs_hb_bm))
|
||||
o2quo_set_hold(qs, node);
|
||||
|
||||
spin_unlock(&qs->qs_lock);
|
||||
}
|
||||
|
Loading…
Reference in New Issue
Block a user