IB/ipoib: Fix race between light events and interface restart

A potential race between light_event and interface restart
may attach multicast group to an already attached QP.

Scenario:
light_event flow goes through ipoib_mcast_dev_flush function,
if a context switch occurs before calling ipoib_mcast_remove_list,
then we may face a situation where the broadcast of the priv is null
and the corresponding QP is not detached yet.
If an "interface restart" runs during the previous context switch,
the following scenario occurs:
When the device goes up, ipoib_ib_dev_up function will be called,
it will send a new registration request to the broadcast group and then
attach the group to the QP that was not detached before.

     IPOIB_FLUSH_LIGHT                                          INTERFACE RESTART

    __ipoib_ib_dev_flush                                                |
        |                                                               |
        |                                                               |
        |                                                               |
    ipoib_mcast_dev_flush                                               |
    Move mcast list and broadcast to remove_list                        |
        |                                                               |
        |                                                               |
    Context Switch-->                                                   |
        |                                                       ipoib_ib_dev_down
        |                                                               |
        |                                                               |
        |                                                       ipoib_ib_dev_up
        |                                                               |
        |                                                               |
        |                                                       ipoib_mcast_join_task
        |                                                       allocate new broadcast
        |                                                               |
        |                                                               |
        |                                                       Attach QP to multicast group
        |                                                               |
        |                                                               |
        |                                                       <--Context Switch
    ipoib_mcast_leave
    Detach QP from multicast group

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
This commit is contained in:
Feras Daoud 2017-07-10 18:45:41 +03:00 committed by Leon Romanovsky
parent a62ab66b13
commit edf3f301db
3 changed files with 4 additions and 0 deletions

View File

@ -336,6 +336,7 @@ struct ipoib_dev_priv {
unsigned long flags;
struct rw_semaphore vlan_rwsem;
struct mutex mcast_mutex;
struct rb_root path_tree;
struct list_head path_list;

View File

@ -1877,6 +1877,7 @@ static void ipoib_build_priv(struct net_device *dev)
priv->dev = dev;
spin_lock_init(&priv->lock);
init_rwsem(&priv->vlan_rwsem);
mutex_init(&priv->mcast_mutex);
INIT_LIST_HEAD(&priv->path_list);
INIT_LIST_HEAD(&priv->child_intfs);

View File

@ -838,6 +838,7 @@ void ipoib_mcast_dev_flush(struct net_device *dev)
struct ipoib_mcast *mcast, *tmcast;
unsigned long flags;
mutex_lock(&priv->mcast_mutex);
ipoib_dbg_mcast(priv, "flushing multicast list\n");
spin_lock_irqsave(&priv->lock, flags);
@ -865,6 +866,7 @@ void ipoib_mcast_dev_flush(struct net_device *dev)
wait_for_completion(&mcast->done);
ipoib_mcast_remove_list(&remove_list);
mutex_unlock(&priv->mcast_mutex);
}
static int ipoib_mcast_addr_is_valid(const u8 *addr, const u8 *broadcast)