2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-16 17:23:55 +08:00
linux-next/net/tipc
Ying Xue b952b2befb tipc: fix potential deadlock when all links are reset
[   60.988363] ======================================================
[   60.988754] [ INFO: possible circular locking dependency detected ]
[   60.989152] 3.19.0+ #194 Not tainted
[   60.989377] -------------------------------------------------------
[   60.989781] swapper/3/0 is trying to acquire lock:
[   60.990079]  (&(&n_ptr->lock)->rlock){+.-...}, at: [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc]
[   60.990743]
[   60.990743] but task is already holding lock:
[   60.991106]  (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc]
[   60.991738]
[   60.991738] which lock already depends on the new lock.
[   60.991738]
[   60.992174]
[   60.992174] the existing dependency chain (in reverse order) is:
[   60.992174]
-> #1 (&(&bclink->lock)->rlock){+.-...}:
[   60.992174]        [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140
[   60.992174]        [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50
[   60.992174]        [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc]
[   60.992174]        [<ffffffffa0000f57>] tipc_bclink_add_node+0x97/0xf0 [tipc]
[   60.992174]        [<ffffffffa0011815>] tipc_node_link_up+0xf5/0x110 [tipc]
[   60.992174]        [<ffffffffa0007783>] link_state_event+0x2b3/0x4f0 [tipc]
[   60.992174]        [<ffffffffa00193c0>] tipc_link_proto_rcv+0x24c/0x418 [tipc]
[   60.992174]        [<ffffffffa0008857>] tipc_rcv+0x827/0xac0 [tipc]
[   60.992174]        [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc]
[   60.992174]        [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980
[   60.992174]        [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70
[   60.992174]        [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130
[   60.992174]        [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0
[   60.992174]        [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490
[   60.992174]        [<ffffffff8155c1b7>] e1000_clean+0x267/0x990
[   60.992174]        [<ffffffff81647b60>] net_rx_action+0x150/0x360
[   60.992174]        [<ffffffff8105ec43>] __do_softirq+0x123/0x360
[   60.992174]        [<ffffffff8105f12e>] irq_exit+0x8e/0xb0
[   60.992174]        [<ffffffff8179f9f5>] do_IRQ+0x65/0x110
[   60.992174]        [<ffffffff8179da6f>] ret_from_intr+0x0/0x13
[   60.992174]        [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20
[   60.992174]        [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0
[   60.992174]        [<ffffffff81033cda>] start_secondary+0x13a/0x150
[   60.992174]
-> #0 (&(&n_ptr->lock)->rlock){+.-...}:
[   60.992174]        [<ffffffff810a8f7d>] __lock_acquire+0x163d/0x1ca0
[   60.992174]        [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140
[   60.992174]        [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50
[   60.992174]        [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc]
[   60.992174]        [<ffffffffa0001e11>] tipc_bclink_rcv+0x611/0x640 [tipc]
[   60.992174]        [<ffffffffa0008646>] tipc_rcv+0x616/0xac0 [tipc]
[   60.992174]        [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc]
[   60.992174]        [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980
[   60.992174]        [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70
[   60.992174]        [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130
[   60.992174]        [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0
[   60.992174]        [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490
[   60.992174]        [<ffffffff8155c1b7>] e1000_clean+0x267/0x990
[   60.992174]        [<ffffffff81647b60>] net_rx_action+0x150/0x360
[   60.992174]        [<ffffffff8105ec43>] __do_softirq+0x123/0x360
[   60.992174]        [<ffffffff8105f12e>] irq_exit+0x8e/0xb0
[   60.992174]        [<ffffffff8179f9f5>] do_IRQ+0x65/0x110
[   60.992174]        [<ffffffff8179da6f>] ret_from_intr+0x0/0x13
[   60.992174]        [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20
[   60.992174]        [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0
[   60.992174]        [<ffffffff81033cda>] start_secondary+0x13a/0x150
[   60.992174]
[   60.992174] other info that might help us debug this:
[   60.992174]
[   60.992174]  Possible unsafe locking scenario:
[   60.992174]
[   60.992174]        CPU0                    CPU1
[   60.992174]        ----                    ----
[   60.992174]   lock(&(&bclink->lock)->rlock);
[   60.992174]                                lock(&(&n_ptr->lock)->rlock);
[   60.992174]                                lock(&(&bclink->lock)->rlock);
[   60.992174]   lock(&(&n_ptr->lock)->rlock);
[   60.992174]
[   60.992174]  *** DEADLOCK ***
[   60.992174]
[   60.992174] 3 locks held by swapper/3/0:
[   60.992174]  #0:  (rcu_read_lock){......}, at: [<ffffffff81646791>] __netif_receive_skb_core+0x71/0x980
[   60.992174]  #1:  (rcu_read_lock){......}, at: [<ffffffffa0002c35>] tipc_l2_rcv_msg+0x5/0xd0 [tipc]
[   60.992174]  #2:  (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc]
[   60.992174]

The correct the sequence of grabbing n_ptr->lock and bclink->lock
should be that the former is first held and the latter is then taken,
which exactly happened on CPU1. But especially when the retransmission
of broadcast link is failed, bclink->lock is first held in
tipc_bclink_rcv(), and n_ptr->lock is taken in link_retransmit_failure()
called by tipc_link_retransmit() subsequently, which is demonstrated on
CPU0. As a result, deadlock occurs.

If the order of holding the two locks happening on CPU0 is reversed, the
deadlock risk will be relieved. Therefore, the node lock taken in
link_retransmit_failure() originally is moved to tipc_bclink_rcv()
so that it's obtained before bclink lock. But the precondition of
the adjustment of node lock is that responding to bclink reset event
must be moved from tipc_bclink_unlock() to tipc_node_unlock().

Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-29 12:40:27 -07:00
..
addr.c tipc: make tipc node address support net namespace 2015-01-12 16:24:33 -05:00
addr.h tipc: make tipc node address support net namespace 2015-01-12 16:24:33 -05:00
bcast.c tipc: fix potential deadlock when all links are reset 2015-03-29 12:40:27 -07:00
bcast.h tipc: fix potential deadlock when all links are reset 2015-03-29 12:40:27 -07:00
bearer.c tipc: ensure that idle links are deleted when a bearer is disabled 2015-03-10 18:37:36 -04:00
bearer.h tipc: add ip/udp media type 2015-03-05 22:08:42 -05:00
core.c tipc: nl compat add noop and remove legacy nl framework 2015-02-09 13:20:49 -08:00
core.h tipc: remove tipc_snprintf 2015-02-09 13:20:49 -08:00
discover.c tipc: add framework for node capabilities exchange 2015-03-14 14:38:32 -04:00
discover.h tipc: involve namespace infrastructure 2015-01-12 16:24:32 -05:00
eth_media.c tipc: make media address offset a common define 2015-02-27 18:18:48 -05:00
ib_media.c tipc: rename media/msg related definitions 2015-02-27 18:18:48 -05:00
Kconfig tipc: add ip/udp media type 2015-03-05 22:08:42 -05:00
link.c tipc: fix potential deadlock when all links are reset 2015-03-29 12:40:27 -07:00
link.h tipc: eliminate race condition at dual link establishment 2015-03-25 14:05:56 -04:00
Makefile tipc: add ip/udp media type 2015-03-05 22:08:42 -05:00
msg.c tipc: clean up handling of message priorities 2015-03-14 14:38:32 -04:00
msg.h tipc: eliminate race condition at dual link establishment 2015-03-25 14:05:56 -04:00
name_distr.c tipc: only create header copy for name distr messages 2015-02-27 18:18:47 -05:00
name_distr.h tipc: resolve race problem at unicast message reception 2015-02-05 16:00:02 -08:00
name_table.c tipc: fix a potential deadlock when nametable is purged 2015-03-17 22:11:26 -04:00
name_table.h tipc: convert legacy nl name table dump to nl compat 2015-02-09 13:20:48 -08:00
net.c tipc: nl compat add noop and remove legacy nl framework 2015-02-09 13:20:49 -08:00
net.h tipc: make tipc node table aware of net namespace 2015-01-12 16:24:32 -05:00
netlink_compat.c tipc: nl compat add noop and remove legacy nl framework 2015-02-09 13:20:49 -08:00
netlink.c tipc: move and rename the legacy nl api to "nl compat" 2015-02-09 13:20:47 -08:00
netlink.h tipc: move and rename the legacy nl api to "nl compat" 2015-02-09 13:20:47 -08:00
node.c tipc: fix potential deadlock when all links are reset 2015-03-29 12:40:27 -07:00
node.h tipc: fix potential deadlock when all links are reset 2015-03-29 12:40:27 -07:00
server.c tipc: withdraw tipc topology server name when namespace is deleted 2015-03-17 22:11:26 -04:00
server.h tipc: make subscriber server support net namespace 2015-01-12 16:24:33 -05:00
socket.c rhashtable: Disable automatic shrinking by default 2015-03-24 17:48:40 -04:00
socket.h tipc: fix netns refcnt leak 2015-03-17 22:11:26 -04:00
subscr.c tipc: fix nullpointer bug when subscribing to events 2015-02-27 18:18:47 -05:00
subscr.h tipc: make subscriber server support net namespace 2015-01-12 16:24:33 -05:00
sysctl.c tipc: add name distributor resiliency queue 2014-09-01 17:51:48 -07:00
udp_media.c tipc: fix compile error when IPV6=m and TIPC=y 2015-03-24 15:20:29 -04:00