linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-16 17:23:55 +08:00

History

Ying Xue b952b2befb tipc: fix potential deadlock when all links are reset [ 60.988363] ====================================================== [ 60.988754] [ INFO: possible circular locking dependency detected ] [ 60.989152] 3.19.0+ #194 Not tainted [ 60.989377] ------------------------------------------------------- [ 60.989781] swapper/3/0 is trying to acquire lock: [ 60.990079] (&(&n_ptr->lock)->rlock){+.-...}, at: [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.990743] [ 60.990743] but task is already holding lock: [ 60.991106] (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.991738] [ 60.991738] which lock already depends on the new lock. [ 60.991738] [ 60.992174] [ 60.992174] the existing dependency chain (in reverse order) is: [ 60.992174] -> #1 (&(&bclink->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] [<ffffffffa0000f57>] tipc_bclink_add_node+0x97/0xf0 [tipc] [ 60.992174] [<ffffffffa0011815>] tipc_node_link_up+0xf5/0x110 [tipc] [ 60.992174] [<ffffffffa0007783>] link_state_event+0x2b3/0x4f0 [tipc] [ 60.992174] [<ffffffffa00193c0>] tipc_link_proto_rcv+0x24c/0x418 [tipc] [ 60.992174] [<ffffffffa0008857>] tipc_rcv+0x827/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] -> #0 (&(&n_ptr->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a8f7d>] __lock_acquire+0x163d/0x1ca0 [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.992174] [<ffffffffa0001e11>] tipc_bclink_rcv+0x611/0x640 [tipc] [ 60.992174] [<ffffffffa0008646>] tipc_rcv+0x616/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] [ 60.992174] other info that might help us debug this: [ 60.992174] [ 60.992174] Possible unsafe locking scenario: [ 60.992174] [ 60.992174] CPU0 CPU1 [ 60.992174] ---- ---- [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] [ 60.992174] * DEADLOCK * [ 60.992174] [ 60.992174] 3 locks held by swapper/3/0: [ 60.992174] #0: (rcu_read_lock){......}, at: [<ffffffff81646791>] __netif_receive_skb_core+0x71/0x980 [ 60.992174] #1: (rcu_read_lock){......}, at: [<ffffffffa0002c35>] tipc_l2_rcv_msg+0x5/0xd0 [tipc] [ 60.992174] #2: (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] The correct the sequence of grabbing n_ptr->lock and bclink->lock should be that the former is first held and the latter is then taken, which exactly happened on CPU1. But especially when the retransmission of broadcast link is failed, bclink->lock is first held in tipc_bclink_rcv(), and n_ptr->lock is taken in link_retransmit_failure() called by tipc_link_retransmit() subsequently, which is demonstrated on CPU0. As a result, deadlock occurs. If the order of holding the two locks happening on CPU0 is reversed, the deadlock risk will be relieved. Therefore, the node lock taken in link_retransmit_failure() originally is moved to tipc_bclink_rcv() so that it's obtained before bclink lock. But the precondition of the adjustment of node lock is that responding to bclink reset event must be moved from tipc_bclink_unlock() to tipc_node_unlock(). Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>		2015-03-29 12:40:27 -07:00
..
addr.c	tipc: make tipc node address support net namespace	2015-01-12 16:24:33 -05:00
addr.h	tipc: make tipc node address support net namespace	2015-01-12 16:24:33 -05:00
bcast.c	tipc: fix potential deadlock when all links are reset	2015-03-29 12:40:27 -07:00
bcast.h	tipc: fix potential deadlock when all links are reset	2015-03-29 12:40:27 -07:00
bearer.c	tipc: ensure that idle links are deleted when a bearer is disabled	2015-03-10 18:37:36 -04:00
bearer.h	tipc: add ip/udp media type	2015-03-05 22:08:42 -05:00
core.c	tipc: nl compat add noop and remove legacy nl framework	2015-02-09 13:20:49 -08:00
core.h	tipc: remove tipc_snprintf	2015-02-09 13:20:49 -08:00
discover.c	tipc: add framework for node capabilities exchange	2015-03-14 14:38:32 -04:00
discover.h	tipc: involve namespace infrastructure	2015-01-12 16:24:32 -05:00
eth_media.c	tipc: make media address offset a common define	2015-02-27 18:18:48 -05:00
ib_media.c	tipc: rename media/msg related definitions	2015-02-27 18:18:48 -05:00
Kconfig	tipc: add ip/udp media type	2015-03-05 22:08:42 -05:00
link.c	tipc: fix potential deadlock when all links are reset	2015-03-29 12:40:27 -07:00
link.h	tipc: eliminate race condition at dual link establishment	2015-03-25 14:05:56 -04:00
Makefile	tipc: add ip/udp media type	2015-03-05 22:08:42 -05:00
msg.c	tipc: clean up handling of message priorities	2015-03-14 14:38:32 -04:00
msg.h	tipc: eliminate race condition at dual link establishment	2015-03-25 14:05:56 -04:00
name_distr.c	tipc: only create header copy for name distr messages	2015-02-27 18:18:47 -05:00
name_distr.h	tipc: resolve race problem at unicast message reception	2015-02-05 16:00:02 -08:00
name_table.c	tipc: fix a potential deadlock when nametable is purged	2015-03-17 22:11:26 -04:00
name_table.h	tipc: convert legacy nl name table dump to nl compat	2015-02-09 13:20:48 -08:00
net.c	tipc: nl compat add noop and remove legacy nl framework	2015-02-09 13:20:49 -08:00
net.h	tipc: make tipc node table aware of net namespace	2015-01-12 16:24:32 -05:00
netlink_compat.c	tipc: nl compat add noop and remove legacy nl framework	2015-02-09 13:20:49 -08:00
netlink.c	tipc: move and rename the legacy nl api to "nl compat"	2015-02-09 13:20:47 -08:00
netlink.h	tipc: move and rename the legacy nl api to "nl compat"	2015-02-09 13:20:47 -08:00
node.c	tipc: fix potential deadlock when all links are reset	2015-03-29 12:40:27 -07:00
node.h	tipc: fix potential deadlock when all links are reset	2015-03-29 12:40:27 -07:00
server.c	tipc: withdraw tipc topology server name when namespace is deleted	2015-03-17 22:11:26 -04:00
server.h	tipc: make subscriber server support net namespace	2015-01-12 16:24:33 -05:00
socket.c	rhashtable: Disable automatic shrinking by default	2015-03-24 17:48:40 -04:00
socket.h	tipc: fix netns refcnt leak	2015-03-17 22:11:26 -04:00
subscr.c	tipc: fix nullpointer bug when subscribing to events	2015-02-27 18:18:47 -05:00
subscr.h	tipc: make subscriber server support net namespace	2015-01-12 16:24:33 -05:00
sysctl.c	tipc: add name distributor resiliency queue	2014-09-01 17:51:48 -07:00
udp_media.c	tipc: fix compile error when IPV6=m and TIPC=y	2015-03-24 15:20:29 -04:00