libertas: clean up scan thread handling

The libertas scan thread expects priv->scan_req to be non-NULL.  In theory,
it should always be set.  In practice, we've seen the following oops:

[ 8363.067444] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 8363.067490] pgd = c0004000
[ 8363.078393] [00000004] *pgd=00000000
[ 8363.086711] Internal error: Oops: 17 [#1] PREEMPT
[ 8363.091375] Modules linked in: fuse libertas_sdio libertas psmouse mousedev ov7670 mmp_camera joydev videobuf2_core videobuf2_dma_sg videobuf2_memops [last unloaded: scsi_wait_scan]
[ 8363.107490] CPU: 0    Not tainted  (3.0.0-gf7ccc69 #671)
[ 8363.112799] PC is at lbs_scan_worker+0x108/0x5a4 [libertas]
[ 8363.118326] LR is at 0x0
[ 8363.120836] pc : [<bf03a854>]    lr : [<00000000>]    psr: 60000113
[ 8363.120845] sp : ee66bf48  ip : 00000000  fp : 00000000
[ 8363.120845] r10: ee2c2088  r9 : c04e2efc  r8 : eef97005
[ 8363.132231] r7 : eee0716f  r6 : ee2c02c0  r5 : ee2c2088  r4 : eee07160
[ 8363.137419] r3 : 00000000  r2 : a0000113  r1 : 00000001  r0 : eee07160
[ 8363.143896] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[ 8363.157630] Control: 10c5387d  Table: 2e754019  DAC: 00000015
[ 8363.163334] Process kworker/u:1 (pid: 25, stack limit = 0xee66a2f8)

While I've not found a smoking gun, there are two places that raised red flags
for me.  The first is in _internal_start_scan, when we queue up a scan; we
first queue the worker, and then set priv->scan_req.  There's theoretically
a 50 ms delay, which should be plenty, but doing things that way just seems
racy (and not in the good way).
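
To make the ordering concern concrete, here is a minimal userspace sketch. It
is not the driver code: fake_priv, fake_request, fake_queue_work and the two
start_scan_* helpers are hypothetical stand-ins, and the queue is modelled in
the worst case, where the worker runs as soon as it is queued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real libertas structures. */
struct fake_request { int n_channels; };

struct fake_priv {
	struct fake_request *scan_req;	/* NULL when no scan is active */
	int worker_saw_null;		/* set if the worker ran without a request */
};

/* Models the scan worker: the real one dereferences scan_req, so we
 * record the NULL case here instead of crashing. */
static void fake_scan_worker(struct fake_priv *priv)
{
	if (priv->scan_req == NULL)
		priv->worker_saw_null = 1;
}

/* Worst-case model of queueing (assumption): the worker runs at once,
 * as if the delay had expired before the caller continued. */
static void fake_queue_work(struct fake_priv *priv)
{
	fake_scan_worker(priv);
}

/* Old ordering: queue the worker, then publish the request. */
static int start_scan_old(struct fake_priv *priv, struct fake_request *req)
{
	priv->worker_saw_null = 0;
	priv->scan_req = NULL;
	fake_queue_work(priv);
	priv->scan_req = req;
	return priv->worker_saw_null;
}

/* New ordering: publish the request, then queue the worker. */
static int start_scan_new(struct fake_priv *priv, struct fake_request *req)
{
	priv->worker_saw_null = 0;
	priv->scan_req = req;
	fake_queue_work(priv);
	return priv->worker_saw_null;
}

/* Self-check: only the old ordering lets the worker see a NULL request. */
static void check_orderings(void)
{
	struct fake_request req = { 4 };
	struct fake_priv priv = { NULL, 0 };

	assert(start_scan_old(&priv, &req) == 1);
	assert(start_scan_new(&priv, &req) == 0);
}
```

With the request published first, the worker sees a valid priv->scan_req no
matter how early it runs.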

The second is in the scan worker thread itself.  Depending on the state of
priv->scan_channel, we cancel pending scan runs and then requeue a run in
300 ms.  We then send the scan command down to the hardware, sleep, and if
we get scan results for all the desired channels, we set priv->scan_req to
NULL.  However, if that happens in less than 300 ms, what happens with
the pending scan run?
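
The stale-run scenario can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the driver's logic: fake_priv,
fake_scan_worker and run_scan are hypothetical, a single work_pending flag
stands in for the delayed work item, and each worker run is assumed to
complete exactly one channel.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real libertas structures. */
struct fake_request { int n_channels; };

struct fake_priv {
	struct fake_request *scan_req;	/* NULL once the scan is done */
	int scan_channel;		/* next channel to scan */
	int work_pending;		/* models a queued delayed work item */
};

/* One worker run. Returns -1 if it ran with no request (the oops case). */
static int fake_scan_worker(struct fake_priv *priv, int cancel_on_done)
{
	if (priv->scan_req == NULL)
		return -1;

	/* Requeue a follow-up run (the 300 ms one in the description). */
	priv->work_pending = 1;

	/* Assumption: the hardware returns results for one channel per run. */
	priv->scan_channel++;

	if (priv->scan_channel >= priv->scan_req->n_channels) {
		if (cancel_on_done)
			priv->work_pending = 0;	/* models cancel_delayed_work() */
		priv->scan_req = NULL;		/* models marking the scan done */
	}
	return 0;
}

/* Drain the pending work; count the runs that found no request. */
static int run_scan(struct fake_request *req, int cancel_on_done)
{
	struct fake_priv priv = { req, 0, 1 };
	int null_runs = 0;

	while (priv.work_pending) {
		priv.work_pending = 0;
		if (fake_scan_worker(&priv, cancel_on_done) < 0)
			null_runs++;
	}
	return null_runs;
}
```

In this model, run_scan(&req, 0) leaves one run that fires after the request
has been cleared, while run_scan(&req, 1) leaves none.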

This patch addresses both of those concerns.  With the patch applied, we
have not seen the oops in the past two weeks.

Signed-off-by: Andres Salomon <dilinger@queued.net>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Andres Salomon 2011-12-19 12:22:58 -08:00 committed by John W. Linville
parent 092fadb00c
commit afbca95f95

@@ -731,9 +731,11 @@ static void lbs_scan_worker(struct work_struct *work)
 					le16_to_cpu(scan_cmd->hdr.size),
 					lbs_ret_scan, 0);
-	if (priv->scan_channel >= priv->scan_req->n_channels)
+	if (priv->scan_channel >= priv->scan_req->n_channels) {
 		/* Mark scan done */
+		cancel_delayed_work(&priv->scan_work);
 		lbs_scan_done(priv);
+	}
 	/* Restart network */
 	if (carrier)
@@ -762,12 +764,12 @@ static void _internal_start_scan(struct lbs_private *priv, bool internal,
 		request->n_ssids, request->n_channels, request->ie_len);
 	priv->scan_channel = 0;
-	queue_delayed_work(priv->work_thread, &priv->scan_work,
-		msecs_to_jiffies(50));
 	priv->scan_req = request;
 	priv->internal_scan = internal;
+	queue_delayed_work(priv->work_thread, &priv->scan_work,
+		msecs_to_jiffies(50));
 	lbs_deb_leave(LBS_DEB_CFG80211);
 }