linux/block
Stanislaw Gruszka 9f53d2fe81 block: fix __blkdev_get and add_disk race condition
The following situation might occur:

__blkdev_get:			add_disk:

				register_disk()
get_gendisk()

disk_block_events()
	disk->ev == NULL

				disk_add_events()

__disk_unblock_events()
	disk->ev != NULL
	--ev->block

Then we unblock events, when they are suppose to be blocked. This can
trigger events related block/genhd.c warnings, but also can crash in
sd_check_events() or other places.

I'm able to reproduce crashes with the following scripts (with
connected usb dongle as sdb disk).

<snip>
DEV=/dev/sdb
ENABLE=/sys/bus/usb/devices/1-2/bConfigurationValue

function stop_me()
{
	for i in `jobs -p` ; do kill $i 2> /dev/null ; done
	exit
}

trap stop_me SIGHUP SIGINT SIGTERM

for ((i = 0; i < 10; i++)) ; do
	while true; do fdisk -l $DEV  2>&1 > /dev/null ; done &
done

while true ; do
echo 1 > $ENABLE
sleep 1
echo 0 > $ENABLE
done
</snip>

I use the script to verify patch fixing oops in sd_revalidate_disk
http://marc.info/?l=linux-scsi&m=132935572512352&w=2
Without Jun'ichi Nomura patch titled "Fix NULL pointer dereference in
sd_revalidate_disk" or this one, script easily crash kernel within
a few seconds. With both patches applied I do not observe crash.
Unfortunately after some time (dozen of minutes), script will hung in:

[ 1563.906432]  [<c08354f5>] schedule_timeout_uninterruptible+0x15/0x20
[ 1563.906437]  [<c04532d5>] msleep+0x15/0x20
[ 1563.906443]  [<c05d60b2>] blk_drain_queue+0x32/0xd0
[ 1563.906447]  [<c05d6e00>] blk_cleanup_queue+0xd0/0x170
[ 1563.906454]  [<c06d278f>] scsi_free_queue+0x3f/0x60
[ 1563.906459]  [<c06d7e6e>] __scsi_remove_device+0x6e/0xb0
[ 1563.906463]  [<c06d4aff>] scsi_forget_host+0x4f/0x60
[ 1563.906468]  [<c06cd84a>] scsi_remove_host+0x5a/0xf0
[ 1563.906482]  [<f7f030fb>] quiesce_and_remove_host+0x5b/0xa0 [usb_storage]
[ 1563.906490]  [<f7f03203>] usb_stor_disconnect+0x13/0x20 [usb_storage]

Anyway I think this patch is some step forward.

As drawback, I do not teardown on sysfs file create error, because I do
not know how to nullify disk->ev (since it can be used). However add_disk
error handling practically does not exist too, and things will work
without this sysfs file, except events will not be exported to user
space.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-03-02 10:44:17 +01:00
..
partitions separate partition format handling from generic code 2012-01-03 22:54:06 -05:00
blk-cgroup.c block: strip out locking optimization in put_io_context() 2012-02-07 07:51:30 +01:00
blk-cgroup.h block: fix a typo in the blk-cgroup.h file 2011-10-24 16:08:38 +02:00
blk-core.c block: don't call elevator callbacks for plug merges 2012-02-08 09:19:42 +01:00
blk-exec.c block: add missing blk_queue_dead() checks 2011-12-14 00:33:37 +01:00
blk-flush.c blk-flush: move the queue kick into 2011-10-24 16:24:31 +02:00
blk-integrity.c block: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros 2011-10-31 19:31:12 -04:00
blk-ioc.c block: exit_io_context() should call elevator_exit_icq_fn() 2012-02-15 09:45:53 +01:00
blk-iopoll.c tree-wide: fix assorted typos all over the place 2009-12-04 15:39:55 +01:00
blk-lib.c block: fix patch import error in max_discard_sectors check 2011-07-23 20:34:59 +02:00
blk-map.c block: re-use existing 'reading' variable instead of checking direction again 2011-12-21 15:27:24 +01:00
blk-merge.c block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator functions 2012-02-08 09:19:38 +01:00
blk-settings.c block: Introduce blk_set_stacking_limits function 2012-01-11 16:27:11 +01:00
blk-softirq.c block: Don't check QUEUE_FLAG_SAME_COMP in __blk_complete_request 2011-09-14 09:31:01 +02:00
blk-sysfs.c block, cfq: move io_cq exit/release to blk-ioc.c 2011-12-14 00:33:42 +01:00
blk-tag.c block: fix blk_queue_end_tag() 2011-12-29 09:16:28 +01:00
blk-throttle.c block: add blk_queue_dead() 2011-12-14 00:33:37 +01:00
blk-timeout.c fault-injection: add ability to export fault_attr in arbitrary directory 2011-08-03 14:25:20 -10:00
blk.h block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator functions 2012-02-08 09:19:38 +01:00
bsg-lib.c block: Change module.h -> export.h in bsg-lib.c 2011-10-31 19:31:13 -04:00
bsg.c bsg: fix sysfs link remove warning 2012-02-08 20:02:03 +01:00
cfq-iosched.c block: replace icq->changed with icq->flags 2012-02-15 09:45:49 +01:00
cfq.h blk-cgroup: Add unaccounted time to timeslice_used. 2011-03-12 16:54:00 +01:00
compat_ioctl.c block: Add BLKROTATIONAL ioctl 2012-01-11 16:29:31 +01:00
deadline-iosched.c block, cfq: move icq cache management to block core 2011-12-14 00:33:42 +01:00
elevator.c block: separate out blk_rq_merge_ok() and blk_try_merge() from elevator functions 2012-02-08 09:19:38 +01:00
genhd.c block: fix __blkdev_get and add_disk race condition 2012-03-02 10:44:17 +01:00
ioctl.c Merge branch 'for-3.3/core' of git://git.kernel.dk/linux-block 2012-01-15 12:24:45 -08:00
Kconfig move fs/partitions to block/ 2012-01-03 22:54:06 -05:00
Kconfig.iosched blk-cgroup: config options re-arrangement 2010-04-26 19:27:56 +02:00
Makefile separate partition format handling from generic code 2012-01-03 22:54:06 -05:00
noop-iosched.c block, cfq: move icq cache management to block core 2011-12-14 00:33:42 +01:00
partition-generic.c block: Fix NULL pointer dereference in sd_revalidate_disk 2012-03-02 10:38:33 +01:00
scsi_ioctl.c block: fail SCSI passthrough ioctls on partition devices 2012-01-14 15:07:24 -08:00