linux/drivers/infiniband/hw
Faisal Latif 5962c2c803 RDMA/nes: Fix nes_nic_cm_xmit() error handling
We are getting crash or hung situation when we are running network
cable pull tests during RDMA traffic.

In schedule_nes_timer(), we return an error if nes_nic_cm_xmit()
returns failure.  This is changed to success as skb is being put on
the timer routines to be processed later.  In send_syn() case, we are
indicating connect failure once from nes_connect() and the other when
the rexmit retries expires.

The other issue is skb->users which we are incrementing before calling
nes_nic_cm_xmit() which calls dev_queue_xmit() but in case of failure
we are decrementing the skb->users at the same time putting the skb on
the rexmit path.  Even if dev_queue_xmit() fails, the skb->users is
decremented already.  We are removing the decrement of skb->users in
case of failure from both schedule_nes_timer() as well as from
nes_cm_timer_tick().

There is also extra check in nes_cm_timer_tick() for rexmit failure
which does a break from the loop is removed.  This causes problem as
the other nodes have their cm_node->ref_count incremented and are not
processed.

Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-04-08 14:23:55 -07:00
..
amso1100 infiniband: convert c2 to net_device_ops 2009-03-21 19:19:13 -07:00
cxgb3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2009-03-26 15:54:36 -07:00
ehca IB: Remove __constant_{endian} uses 2009-01-17 17:11:57 -08:00
ipath Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-03-26 16:05:42 -07:00
mlx4 Merge branches 'cxgb3', 'endian', 'ipath', 'ipoib', 'iser', 'mad', 'misc', 'mlx4', 'mthca', 'nes' and 'sysfs' into for-next 2009-03-24 20:44:41 -07:00
mthca IB/mthca: Fix dispatch of IB_EVENT_LID_CHANGE event 2009-01-28 15:15:56 -08:00
nes RDMA/nes: Fix nes_nic_cm_xmit() error handling 2009-04-08 14:23:55 -07:00